1. INTRODUCTION


THE physics of the twentieth century differs fundamentally from the physics 
of the nineteenth mainly because of two theories known respectively as the 
Quantum Theory and the Theory of Relativity which form an integral part 
of its scheme of thought. It is these two theories that have enabled a far 
deeper understanding of the nature of the physical world to be attained than 
was possible at the end of the last century. One must here acknowledge 
the work of Albert Einstein who played the leading role in the development 
of both of these theories. His publications during the first two decades 
of the present century bear on every page the imprint of a powerful and 
penetrating intellect. Even after this lapse of years, the physicist of today 
will find the study of those papers a profitable and stimulating experience. 
Indeed, a good deal of what I have to say in this address only reflects the 
results of such a study in relation to the fundamental problems of the 
crystalline state of matter—a subject which has deeply interested me for 
several years past. 




2. ORIGIN OF THE QUANTUM THEORY 


The quantum theory arose from the attempt to explain the characters
of the radiation which emerges from the window of an enclosed furnace 
heated to high temperatures. As is well known, the total intensity of such 
radiation increases rapidly with rise of temperature of the furnace. Simul- 
taneously, there is a shift of the spectral maximum of intensity towards 
higher frequencies, as is indeed evident from the progressive change in colour 
of the radiation. Thermodynamic considerations indicate that this shift 
should occur in such a manner that the spectral frequency at the point of 
maximum intensity should be directly proportional to the absolute tempera- 
ture of the furnace. Quantitative measurements confirm that this is the case 
and show that the changes in the intensity as well as in the spectral character 
of the radiation with rise of temperature agree with a formula for the spectral 
intensity in which the cube of the spectral frequency appears multiplied by 
an exponential function of a type made familiar by Boltzmann’s well-known
principle. The argument of the exponential function is negative and has as 
its numerator the spectral frequency multiplied by one universal constant 
and as its denominator the absolute temperature multiplied by another uni- 
versal constant. A small but important modification secures a much more 
satisfactory agreement between the formula and the facts of observation. 
In the modified expression, the exponential has the same argument with a 
positive sign and now appears in the denominator with unity subtracted 
from it. This is the celebrated Planck formula except for a multiplying 
numerical factor. 
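

The two formulae described in words above may be set down in symbols as follows. This is only a sketch in present-day notation and not the author's own: the symbols I for the spectral intensity, h and k for the two universal constants, and C for the multiplying factor left unspecified in the text are assumptions made for illustration. The first expression is the one usually associated with the name of Wien.

```latex
% Requires amsmath. I(\nu, T): spectral intensity (assumed symbol);
% C: the unspecified multiplying factor; h, k: the two universal constants.
\begin{align*}
  \text{first formula:}    \quad I(\nu, T) &= C\,\nu^{3}\,\exp\!\left(-\frac{h\nu}{kT}\right) \\
  \text{modified formula:} \quad I(\nu, T) &= \frac{C\,\nu^{3}}{\exp\!\left(\dfrac{h\nu}{kT}\right) - 1}
\end{align*}
```

When the argument of the exponential is large, the unity in the denominator is negligible and the second expression goes over into the first, which is why the modification is a small one.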


3. EINSTEIN’S DERIVATION OF THE PLANCK FORMULA


Einstein gave a physical interpretation of the Planck formula and also 
showed how the formula could be derived on the basis of simple physical 
considerations. He interpreted the formula to mean that radiation of all 
frequencies is emitted and absorbed by material bodies in discrete quanta 
of energy proportional to their respective frequencies. He also showed 
that the radiation formula follows very naturally if we assume that the energy 
of the material particle which emits the radiation is itself quantised, in other 
words, its energy of vibration alters by successive steps, each of which is 
equal to the quantum of radiation energy which is emitted in the process. 


A more complete and logically satisfying derivation of the Planck radia- 
tion formula was given by Einstein ten years later, viz., in 1917. In that 
paper, the notion of probability which in the quantum theory replaces the 
determinism of the older physics finds a prominent place. Instead of assum- 
ing the radiator to be a harmonic oscillator as in his paper of 1907, Einstein 
dealt with the most general case of an oscillator which has a number of dis- 
crete energy levels. The probability of its being present in any one of them 
is expressed by the product of the inherent statistical weight characteristic 
of the level multiplied by the appropriate thermodynamic probability factor. 
The latter takes the form of an exponential function with a negative argument 
equal to the energy of the state divided by the product of the absolute tempe- 
rature and the Boltzmann constant. Einstein then considers the probability 
of three different kinds of elementary processes occurring in any given small 
time interval. The first is a spontaneous transition from the higher to a 
lower state of energy with emission of radiation as contemplated in Bohr’s 
theory of spectra; the second is a transition of the same nature but now 
induced by the presence of an external radiation field; the third is a transi- 
tion from the lower to the higher energy state also induced by the external 
field. The probabilities of the two latter transitions are taken as proportional

Quantum Theory and Crystal Physics 363 


to the energy density in the surrounding radiation field. A transfer of energy 
from the oscillator to the field and an absorption of energy from the field 
by the oscillator are involved respectively in the two processes. In a steady 
state of affairs, the probabilities of transition in the two opposite directions 
must necessarily balance each other. These considerations lead at once 
to the Planck radiation formula. 
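

The balance of probabilities described above can be written out in a few lines. The notation is assumed for illustration: m and n label the higher and lower states, g their statistical weights, A the probability of the spontaneous transition, B and B' the coefficients of the two induced transitions, and ρ the energy density of the radiation field. The relation between B and B' in the last line is the additional condition usually imposed so that ρ increases without limit with the temperature; it is filled in here and is not stated in the text above.

```latex
% Requires amsmath. All symbols are assumed notation, see the lead-in.
\begin{align*}
  N_m \propto g_m\, e^{-E_m/kT}, \qquad
  N_n &\propto g_n\, e^{-E_n/kT}, \qquad
  E_m - E_n = h\nu \\
  N_m \left( A + B\,\rho \right) &= N_n\, B'\,\rho \\
  \rho &= \frac{A/B}{\dfrac{g_n B'}{g_m B}\, e^{h\nu/kT} - 1}
        = \frac{A/B}{e^{h\nu/kT} - 1}
        \quad \text{when } g_n B' = g_m B
\end{align*}
```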


4. THE CRYSTAL AS AN ASSEMBLY OF OSCILLATORS 


The foregoing exposition of Einstein’s original ideas is intended to 
furnish a theoretical background for a consideration of the fundamental 
properties of the solid state which is the subject of the present address. Ele- 
mentary processes closely analogous to those contemplated in Einstein’s 
paper of 1917 successfully describe the phenomena actually observed when a 
beam of monochromatic light traverses a crystal and the light diffused in 
its interior is examined spectroscopically. We observe in the spectrum of 
the scattered light sharply defined lines with frequencies both higher and 
lower than that of the incident radiation. The ratio of the intensities of 
each such pair of lines having equal spectral displacements in opposite direc- 
tions is found to be expressed correctly by a Boltzmann factor corresponding 
to the change of frequency multiplied by Planck’s constant, this again being 
multiplied by the fourth power of the ratio of the two spectral frequencies. 
These facts indicate that the displaced frequencies arise from transitions from 
a higher to a lower energy state and vice versa induced in the elementary 
oscillators comprised in the crystal by the incident radiation. We are thus 
naturally led to regard the crystal as an assembly of a great number of 
oscillators which form a system in thermodynamic equilibrium. The thermal 
energy of the crystal may then be equated to the sum of the thermal energies 
of all the oscillators of the different sorts of which it is composed. 
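

The intensity ratio stated above for a pair of oppositely displaced lines can be evaluated numerically. The following is a minimal sketch, assuming frequencies expressed as wave-numbers in cm⁻¹; the function name, the terms Stokes and anti-Stokes, and the illustrative exciting wavelength of 4358 Å are assumptions not taken from the text.

```python
import math

# Second radiation constant hc/k in cm*K, so that wave-numbers can be used directly.
HC_OVER_K_CM_K = 1.438777


def anti_stokes_to_stokes_ratio(incident_cm1, shift_cm1, temperature_k):
    """Intensity of the line displaced to higher frequency divided by that of
    the line displaced to lower frequency by the same amount.

    The result is the Boltzmann factor for the energy h*c*shift multiplied by
    the fourth power of the ratio of the two scattered frequencies.
    """
    boltzmann_factor = math.exp(-HC_OVER_K_CM_K * shift_cm1 / temperature_k)
    frequency_factor = ((incident_cm1 + shift_cm1) / (incident_cm1 - shift_cm1)) ** 4
    return boltzmann_factor * frequency_factor


# Hypothetical illustration: a shift of 1332 cm^-1 excited by light of
# wavelength 4358 angstroms in a crystal held at 300 K.
incident = 1.0e8 / 4358.0  # wavelength in angstroms converted to cm^-1
ratio = anti_stokes_to_stokes_ratio(incident, 1332.0, 300.0)
print(f"ratio of higher-frequency to lower-frequency line: {ratio:.4f}")
```

For so large a shift at room temperature the ratio is a few parts in a thousand, and it rises quickly as the crystal is heated.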


It is evident from what has just been stated that the specific heats of 
crystals stand in the closest relation to their spectroscopic properties. The 
first step in the theoretical evaluation of the thermal energy of the crystal 
is accordingly to identify and enumerate the oscillators of which it is com- 
posed and to discover and specify the energy states which they can occupy. 


5. THE OSCILLATORS AND THEIR ENERGY LEVELS 


To begin with, we may provisionally identify the oscillators with whose 
behaviour we are concerned with the groups of atoms present in the unit 
cells of the crystal structure. To discover the energy levels which these 
oscillators can occupy, we may, at least in regard to the infra-red or
vibrational levels, adopt the same procedure as that which has proved itself
abundantly successful in the field of molecular spectroscopy. As is well
known, that procedure consists in determining and enumerating the different 
possible modes of vibration in each one of which the atoms all vibrate with 
the same frequency and in the same or opposite phases. 


In endeavouring to carry through the procedure indicated above, the 
difficulty immediately presents itself that the group of atoms present in any 
one unit cell of the crystal structure is not isolated but forms a connected 
system with the groups of atoms in the surrounding cells and these latter 
again are connected with groups of atoms further out and so forth. The 
mathematical and physical difficulties which present themselves by virtue of 
these interconnections disappear when we make use of the fundamental 
property of crystal structure, viz., that it comes into coincidence with itself 
following a unit translation along any one of its three axes. Hence any 
normal mode of vibration should also possess the same property, viz., it 
remains a normal mode following a unit translation of the crystal. This 
requirement immediately enables us to determine and enumerate the normal 
modes in the most general case of a crystal consisting of several interpene- 
trating Bravais lattices of equivalent atoms. It emerges that the normal 
modes are divisible into two classes; in the first class, the amplitudes as 
well as the phases of oscillation of equivalent atoms in adjoining cells of 
the lattice structure are identical, while in the second class of normal modes 
the amplitudes of equivalent atoms are the same but the phases are reversed 
along one or two or all three of the axes of the lattice. If the crystal consists
of p interpenetrating Bravais lattices, there are (3p − 3) normal modes
of the first class and 21p modes of the second class.
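

The enumeration just stated is easily tabulated. The sketch below merely evaluates the two counts for a given p and checks that their sum is three less than the number of degrees of freedom of the 8p atoms contained in a block of 2 × 2 × 2 unit cells; the function name is an assumption, and the case p = 2 corresponds to a structure such as diamond built of two interpenetrating lattices.

```python
def count_normal_modes(p):
    """Return (first_class, second_class, total) for p interpenetrating Bravais lattices."""
    if p < 1:
        raise ValueError("p must be a positive integer")
    first_class = 3 * p - 3   # equivalent atoms in adjoining cells move in the same phase
    second_class = 21 * p     # phases reversed along one, two or all three axes
    return first_class, second_class, first_class + second_class


for p in (1, 2, 4):
    first, second, total = count_normal_modes(p)
    # 8p atoms have 24p degrees of freedom; the three not counted here are
    # the translations of the block of atoms as a whole.
    assert total == 3 * 8 * p - 3
    print(p, first, second, total)
```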


Thus the result emerges that the vibrational energy levels of a crystal 
form a sharply defined set in much the same manner as the vibrational energy 
levels in the spectra of molecules. But this result would necessarily be 
modified when the effects of anharmonicity and the interactions of the differ- 
ent normal modes with each other are taken into consideration. 


6. THE SPECTROSCOPIC BEHAVIOUR OF CRYSTALS 


The theoretical results stated above are in complete agreement with 
the actual spectroscopic behaviour of crystals in the infra-red region of 
frequencies as revealed by diverse techniques of observation in appropriate 
physical conditions. For example, they furnish an immediate explanation 
of the spectroscopic effects exhibited by crystals in the scattering of mono- 
chromatic light as mentioned earlier. In some cases the energy levels are 
shown by the spectral shifts to exhibit a lack of sharpness. That this arises 
from the disturbing effects of anharmonicity is demonstrated by cooling
down the crystal to liquid air temperature. The energy levels then become 
perfectly sharp, as is to be expected. We need not dilate here upon the 
different techniques of spectroscopic observation which are available only 
in particular cases. Mention should be made, however, of the very general 
method of observing the energy levels in crystals by the techniques of 
infra-red absorption. These latter have been greatly improved of recent years 
and the results obtained with such improved techniques completely confirm 
the theoretical findings stated above. 


A feature of special interest to which reference may be made here is in 
respect of the possibility of observing the 21p normal modes of the second
class in which the phases of oscillation are opposed in adjoining cells of the 
crystal structure. It is to be expected that by reason of such opposition 
of phase these modes would be precluded from observation by any of the 
available methods of spectroscopic study. Fortunately, however, and for 
reasons which I shall not here dilate upon, this is not invariably so. The 
normal modes of the second class are actually accessible to observation in 
several cases and they then manifest themselves as discrete and sharply 
defined lines in the spectra, provided the effects of anharmonicity are either 
absent or else are suppressed by the use of adequately low temperatures. 
Their appearance is one of the most striking vindications of the correctness 
of the present theoretical approach. 


7. THE SPECIFIC HEATS OF CRYSTALS


Regarding a crystal as an assembly of an immense number of oscillators 
in thermodynamic equilibrium, the evaluation of its thermal energy as a 
function of the temperature reduces itself to the problem of classifying and 
enumerating the different sorts of oscillators comprised in it and determining 
the scheme of energy levels for the oscillators of each sort. An application 
of Boltzmann’s principle then enables us to evaluate the average energy of 
an oscillator of that sort, and multiplying it by the number of such
oscillators we obtain a sum total; the addition of the sums thus found for the
different sets of oscillators gives the total thermal energy of the crystal. By
differentiating this total with respect to the temperature, we obtain the 
specific heat of the crystal. 
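

For a harmonic oscillator the two steps just described, averaging over the energy levels by Boltzmann's principle and then differentiating with respect to the temperature, lead to Einstein's specific heat function. The sketch below is an illustration under stated assumptions: frequencies are wave-numbers in cm⁻¹, energies are divided by the Boltzmann constant, the zero-point energy is left out since it does not affect the specific heat, and all function names are hypothetical.

```python
import math

HC_OVER_K_CM_K = 1.438777  # hc/k in cm*K


def mean_energy_over_k(freq_cm1, temperature_k):
    """Average thermal energy of one harmonic oscillator divided by k, in kelvin."""
    theta = HC_OVER_K_CM_K * freq_cm1
    return theta / math.expm1(theta / temperature_k)


def einstein_function(x):
    """Specific heat of one oscillator in units of k, with x = h*nu / (k*T)."""
    if x == 0.0:
        return 1.0  # classical limit
    decay = math.exp(-x)
    return x * x * decay / (1.0 - decay) ** 2


def specific_heat_by_differencing(freq_cm1, temperature_k, step_k=0.01):
    """Numerical derivative of the mean energy with respect to the temperature."""
    upper = mean_energy_over_k(freq_cm1, temperature_k + step_k)
    lower = mean_energy_over_k(freq_cm1, temperature_k - step_k)
    return (upper - lower) / (2.0 * step_k)


# The closed form and the numerical derivative should agree closely.
x = HC_OVER_K_CM_K * 1000.0 / 300.0
print(einstein_function(x), specific_heat_by_differencing(1000.0, 300.0))
```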


As already stated, we have (3p − 3) normal modes of vibration of the
first kind and 21p normal modes of the second kind. Thus, we have
(24p − 3) modes and frequencies in all and these have equal statistical
weight. They may be regarded as the internal modes of vibration of the 8p
atoms contained in a volume element of the crystal whose dimensions are
twice as large in each direction as the unit cell of the crystal structure. The
three omitted degrees of freedom represent the translatory movements of
these groups of 8p atoms each. If we leave the latter aside for a moment
and also neglect the effects of anharmonicity, the specific heat of a crystal
may be expressed very simply as the sum of (24p − 3) Einstein functions,
each involving its own characteristic frequency; the total number of
oscillators which contribute is the number of groups of 8p atoms each
comprised in the crystal. To this sum must be added the contribution to the
specific heat arising from the oscillations inside the crystal which are
attributable to the translatory movements of these groups of 8p atoms each.
In a paper which has appeared in the October issue of the Proceedings of
the Academy, it has been shown how the latter contribution may be
rigorously evaluated. The argument by which this is effected may be very
simply stated. The translatory movements of the groups of 8p atoms each
give rise to oscillatory movements in volume elements which comprise a 
still larger number of atoms. By taking a succession of volume elements 
of different sizes and taking note of the circumstance that the lower limit 
of frequencies of vibration thus arising would diminish in proportion to 
the increasing dimensions of the volume element, the spectral distribution 
of frequencies follows immediately. Their contribution to the thermal 
energy of the crystal is found to be expressible as an integral having a well-known
form involving Einstein’s expression for the average thermal energy
of a harmonic oscillator. 
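

The part of the specific heat arising from the discrete frequencies may be written compactly as below. This is a sketch in assumed notation: N stands for the number of groups of 8p atoms in the crystal, the ν_i for the (24p − 3) characteristic frequencies taken with equal weight, and E for Einstein's function. The contribution of the translatory movements, which is an integral whose precise form is not given in this address, is deliberately left out.

```latex
% Requires amsmath. N, \nu_i and the subscript "discrete" are assumed notation.
\begin{align*}
  C_{\text{discrete}}(T) &= N k \sum_{i=1}^{24p-3} E\!\left(\frac{h\nu_i}{kT}\right),
  &
  E(x) &= \frac{x^{2}\, e^{x}}{\left(e^{x} - 1\right)^{2}}
\end{align*}
```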


It may be mentioned in conclusion that the method sketched above has 
been successfully applied to the evaluation of the specific heats of crystals— 
including especially diamond and the metallic elements—down to the very 
lowest temperatures approaching absolute zero. The theory emerges
triumphantly from the test.


8. SUMMARY 


The fundamental notions of quantum theory and thermodynamics 
indicate that a crystal should be regarded as an assembly of an immense 
number of oscillators whose energy states are quantised and which form a 
system in thermodynamic equilibrium. They also indicate that the spectro- 
scopic properties and the thermal behaviour of crystals stand in the closest 
relation to each other. We are thus left with the problem of discovering 
and enumerating the oscillators of the different sorts comprised in the 
crystal and of determining their scheme of energy levels. This may be done 
by methods analogous to those which have proved successful in the field 
of molecular spectroscopy. The results obtained are in perfect agreement 
with the observed spectroscopic properties and thermal behaviour of crystals,


1. INTRODUCTION


A THEORY of the specific heats of crystals has been put forward in Part I 
of this series of papers which is based on the determination and enumera- 
tion of the normal modes and frequencies of vibration of the atoms in the 
crystal about their positions of equilibrium. The theory enables the thermal 
energy of a crystal to be expressed as a function of the temperature in terms 
of these frequencies. Diamond is admirably suited for a test of the theory, 
since the frequencies of vibration of the atoms in its structure may be evaluated 
theoretically and the same frequencies also admit of precise measurement 
by several different spectroscopic techniques. The specific heat of diamond 
can accordingly be determined in terms of these frequencies over the whole 
range of temperatures for which reliable data are available. As has been 
shown in Part II of this series of papers, the theory emerges triumphantly 
from the test, its results being in complete accord with the results of the 
spectroscopic investigations on the one hand and with the measured specific 
heat data on the other. 


In the present memoir we shall consider the converse problem of deducing 
the nature of the atomic vibration spectrum for a given crystal from the 
empirically determined specific heat data. The method adopted for this 
purpose may be briefly stated here. We assume that all the atomic oscil- 
lators in the crystal have a common frequency of vibration and calculate 
from the observed specific heat at any given temperature what that frequency 
is. The frequency thus evaluated itself appears as a function of the tempera- 
ture, and a graph showing its variation over the entire range of temperature 
gives us a useful indication of how the total number of degrees of atomic 
freedom is distributed over the entire range of frequencies covered by the 
atomic vibration spectrum of the crystal. The results obtained by this 
procedure and their significance are best understood by considering an actual 
example. We shall apply the method to the analysis of the specific heat 
data for diamond and show how useful conclusions may be derived therefrom.


Fig. 1. Analysis of the Specific Heat of Diamond.


2. ANALYSIS OF THE SPECIFIC HEAT CURVE


In Fig. 1 above, we reproduce the graph of the specific heat of diamond 
as a function of the temperature deduced from the spectroscopic data in 
Part II of the present series of papers. The abscissae in the figure are the
absolute temperatures, while the ordinates give the calculated specific heats, 
the scale for the same appearing on the left-hand side of the figure. Taking 
the value of the specific heat given by this graph for any given temperature 
and with the aid of a table of Einstein’s specific heat function, a frequency 
of vibration is found which, if ascribed to all the atomic oscillators in the 
crystal, would give that value for the specific heat at that temperature. We 
may call the frequency thus evaluated the effective average of the atomic 
vibration frequencies at that temperature. A graph showing how this effec- 
tive frequency varies with the absolute temperature appears in Fig. 1 as a 
continuous curve; the scale of frequencies is that shown on the right-hand 
side of the figure. It will be seen that the graph is practically a horizontal 
line at the highest temperatures, the frequency having the value 1016 cm.⁻¹
at 1100°; it then drops very slowly, being 1013 at 1000°, 1006 at 800°, and
992 at 600° K. Thereafter, it falls a little more quickly, being 978 at 500°,
954 at 400°, 911 at 300°, 826 at 200° and 767 at 160°. At still lower
temperatures, the frequency drops down steeply and at 25° reaches the value 247 cm.⁻¹.
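

The use of a table of Einstein's function to find the effective average frequency may be imitated numerically. The sketch below is an illustration only, assuming the specific heat is supplied as a fraction of its classical limiting value and that frequencies are wave-numbers in cm⁻¹; the function names and the bisection procedure are assumptions and not the method of the paper, which relied on tables.

```python
import math

HC_OVER_K_CM_K = 1.438777  # hc/k in cm*K


def einstein_function(x):
    """Einstein's specific heat function, with x = h*nu / (k*T)."""
    if x == 0.0:
        return 1.0
    decay = math.exp(-x)
    return x * x * decay / (1.0 - decay) ** 2


def effective_frequency(reduced_specific_heat, temperature_k, x_max=50.0):
    """Frequency in cm^-1 which, ascribed to every oscillator, gives the stated
    specific heat (expressed as a fraction of the classical limit) at this temperature."""
    if not 0.0 < reduced_specific_heat < 1.0:
        raise ValueError("reduced specific heat must lie strictly between 0 and 1")
    low, high = 0.0, x_max
    # The Einstein function decreases steadily as x grows, so bisection suffices.
    for _ in range(200):
        middle = 0.5 * (low + high)
        if einstein_function(middle) > reduced_specific_heat:
            low = middle
        else:
            high = middle
    return 0.5 * (low + high) * temperature_k / HC_OVER_K_CM_K


# Round trip: a crystal whose oscillators all had 911 cm^-1 would, at 300 K,
# return that same frequency as its effective average.
reduced = einstein_function(HC_OVER_K_CM_K * 911.0 / 300.0)
print(effective_frequency(reduced, 300.0))
```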


The course taken by the frequency-temperature curve is readily under- 
stood if we recall the features exhibited by Einstein’s specific heat function 
for various values of the argument; the function vanishes for large values 
of the argument, while for small values it reaches a limit in the vicinity of 
which the function does not vary appreciably with the argument; inter- 
mediately, however, the function decreases progressively as the argument 
increases and at an approximately uniform rate. The specific heat curve 
which we have analysed was obtained by the summation of a set of Einstein 
functions with different arguments, giving them fractional weights
proportionate to the number of oscillators having the particular frequencies. In
these circumstances, the “effective average frequency” deduced in the manner
explained would necessarily vary with the temperature; at high
temperatures, the “effective average frequency” would be the same as the arithmetical
average of the frequencies multiplied by their respective weights but with the
very lowest frequencies excluded in casting the average. At moderately
high temperatures, the effective average would continue to approximate to 
the arithmetical average, but if the temperature be so low that the Einstein 
functions for some of the higher frequencies become vanishingly small, it 
would show a marked fall and finally, when all the higher frequencies have 
dropped out in the summation, it is the few surviving oscillators with the 
lowest frequencies that would determine the effective average frequency. 
The latter would then be necessarily very small. 


The specific heat curve appearing in Fig. 1 was derived from a set of
Einstein functions representing monochromatic frequencies whose values 
and respective degeneracies are the following: 1332 (3), 1250 (8), 1239 (6), 
1149 (4), 1088 (6), 1008 (4), 740 (6) and 621 (8) and, in addition, a residual 
continuous spectrum with a weight three. The arithmetical sum of all these 
vibration frequencies multiplied by their respective degeneracies and divided 
by the total of 48 is 987 cm.⁻¹. If, however, we omit the continuous spectrum
and take the arithmetical average after division by 45, we obtain 1022 cm.⁻¹
as the arithmetical average frequency. This is nearly the same as the value
of the effective average frequency at 1100° which is 1016 cm.⁻¹. The course
of the graph in the middle ranges of temperature is determined by the rela- 
tive weights of the different frequencies. It will be noticed that these weights 
are distributed in a more or less uniform manner over the entire range from 
1332 cm.⁻¹ to 621 cm.⁻¹. It is this feature which is responsible for the graph
of the effective frequency dropping quite gradually from 978 cm.⁻¹ to 767 cm.⁻¹
in the temperature range between 500° and 160°.
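

The arithmetical average quoted above for the eight discrete frequencies can be checked directly from the listed values and degeneracies. The sketch below does only that; the residual continuous spectrum of weight three is left out because its distribution is not specified here, and the variable and function names are assumptions.

```python
# (frequency in cm^-1, degeneracy) for the discrete frequencies of diamond listed above.
DIAMOND_MODES = [
    (1332, 3), (1250, 8), (1239, 6), (1149, 4),
    (1088, 6), (1008, 4), (740, 6), (621, 8),
]


def weighted_mean_frequency(modes):
    """Arithmetical average of the frequencies weighted by their degeneracies."""
    total_weight = sum(weight for _, weight in modes)
    weighted_sum = sum(frequency * weight for frequency, weight in modes)
    return weighted_sum / total_weight


total_degeneracy = sum(weight for _, weight in DIAMOND_MODES)
mean_frequency = weighted_mean_frequency(DIAMOND_MODES)
print(total_degeneracy, round(mean_frequency))
```

The degeneracies add up to 45, which is the value of (24p − 3) for p = 2, and the weighted average rounds to 1022 cm⁻¹ in agreement with the figure in the text.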


3. COMPARISON WITH THE OBSERVATIONS 


As has already been shown in Part II of this series of papers, a highly 
satisfactory agreement emerges when the specific heat computed from the 
spectroscopic data is compared with the values measured by DeSorbo in 
the temperature range from 40° to 300° and by Magnus and Hodler between 
300° K. and 1100°K. The same comparison may be made in a different 
manner, viz., by calculating from the observed specific heat at any tempera- 
ture the effective average of the frequencies of the atomic oscillators and 
plotting them on the same graph as the effective average calculated from 
the theoretical specific heat curve. This has been done and the experimental 
values are shown as circles in Fig. 1 above. The specific heats from 40° to 
300° were in the present instance taken from the table of smoothed means 
given by DeSorbo as best representing his determinations [Jour. Chem. Phys., 
21 (1953), 876]. The experimental data from 300° upwards were those 
determined by Magnus and Hodler [Annalen der Physik., 80 (1926), 808].
It will be seen that over the whole range of temperatures up to 400° the
experimental values fall smoothly on the theoretical curve. Between 400°
and 1000° the experimental values lie about the theoretical curve, but there
are appreciable deviations of about ± 10 cm.⁻¹. In this region of
temperatures, this would correspond to variation in the specific heats of about 2 per
cent. of the measured values. These differences may be reasonably explained 
as due to inevitable errors in the determination of the specific heats at such 
high temperatures with small quantities of the material (10 grams). 


4. ANALYSIS OF DEBYE’S SPECIFIC HEAT FUNCTION


The values for the specific heat of diamond given by Debye’s theory 
have been analysed in the same manner as that explained above and repre- 
sented in Fig. 1 as a broken line. In making this calculation, the upper 
limit of frequency in the Debye integral has been taken to be 1332 cm.⁻¹
which is the spectroscopically observed highest fundamental frequency. 
This limiting frequency also fits the experimentally observed specific heats 
between 450° and 1100° with an accuracy of 1 per cent., the deviations being 
as often positive as negative. It should also be remarked that the limiting 
frequency calculated from the elastic constants of diamond comes out as 
1304 cm.⁻¹ in fair agreement with the spectroscopic value of 1332 cm.⁻¹.
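

The analysis of the Debye function described above may be sketched numerically as follows. This is an illustration under stated assumptions and not the computation actually made for Fig. 1: the Debye integral is evaluated by a simple midpoint rule, the equivalent Einstein frequency is found by bisection, frequencies are wave-numbers in cm⁻¹, and all function names are hypothetical.

```python
import math

HC_OVER_K_CM_K = 1.438777  # hc/k in cm*K


def einstein_function(x):
    """Einstein's specific heat function, with x = h*nu / (k*T)."""
    if x == 0.0:
        return 1.0
    decay = math.exp(-x)
    return x * x * decay / (1.0 - decay) ** 2


def debye_function(x_limit, steps=2000):
    """Debye specific heat as a fraction of the classical limit.

    Midpoint-rule value of (3 / x_limit^3) times the integral from 0 to
    x_limit of t^2 * E(t), E being the Einstein function.
    """
    if x_limit <= 0.0:
        return 1.0
    width = x_limit / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        total += t * t * einstein_function(t)
    return 3.0 * total * width / x_limit ** 3


def einstein_argument_for(reduced_specific_heat, x_max=50.0):
    """Argument x at which the Einstein function equals the given value."""
    low, high = 0.0, x_max
    for _ in range(200):
        middle = 0.5 * (low + high)
        if einstein_function(middle) > reduced_specific_heat:
            low = middle
        else:
            high = middle
    return 0.5 * (low + high)


def debye_effective_frequency(limit_cm1, temperature_k):
    """Effective average frequency in cm^-1 implied by the Debye function."""
    x_limit = HC_OVER_K_CM_K * limit_cm1 / temperature_k
    x_effective = einstein_argument_for(debye_function(x_limit))
    return x_effective * temperature_k / HC_OVER_K_CM_K


for temperature in (1100.0, 500.0, 140.0, 60.0):
    print(temperature, round(debye_effective_frequency(1332.0, temperature)))
```

With the limiting frequency of 1332 cm⁻¹ the effective frequency so computed falls steadily as the temperature is lowered.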


Comparing now the continuous curve and the broken line appearing 
in Fig. 1, it will be seen that the latter lies entirely above the former in the 
temperature range from 1100° to 140°. The broken curve crosses the
continuous curve at about 140° and lies below the latter down to the lowest
temperatures.


This difference in the course of the two curves is highly significant. For,
it indicates that between 140° and 400°, the Debye theory gives consistently
lower specific heats than those actually observed, while between 40° and 140°,
it gives higher specific heats than those observed. The actual specific heats,
theoretical as well as those observed in these ranges, have been plotted in
Figs. 2 and 3 below, as continuous and broken curves and as circles
respectively, and exhibit this situation very clearly.


5. COMMENTS ON DEBYE’S THEORY


We shall now consider the theoretical implications which attach to the 
facts elicited by the foregoing analysis.


In the first place we remark that since the effective average frequency
in the temperature range between 1100° and 500° lies close to 1000 cm.⁻¹,
any assumption whatever regarding the distribution of frequencies in the
atomic vibration spectrum which gives us 1000 cm.⁻¹ as the effective average
would fit the specific heat data satisfactorily within the limits of error of
the experimental determinations. If, for example, we assume that all the
atomic oscillators had a frequency of 1000 cm.⁻¹, the calculated specific heat
would agree with the observed values in that temperature range within one
or two per cent. Likewise, if we assume that half the atomic oscillators
have a frequency of 1332 cm.⁻¹ and the other half have the frequency 666 cm.⁻¹,
thereby giving us an average frequency of 999 cm.⁻¹, the specific heat data
would also be fitted in that range with the same measure of accuracy. It
follows that the agreement between the specific heat theory of Debye and
the experimental determinations in this temperature range only indicates
that the distribution of frequencies assumed in that theory gives the
arithmetical average of the frequencies more or less correctly. That is so, since
the arithmetical average is three-fourths of the limiting frequency and is
therefore 999 cm.⁻¹.
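

The two averages quoted in this paragraph may be verified as follows. The sketch assumes, as in the Debye theory, that the number of vibrations per unit range of frequency is proportional to the square of the frequency up to the limiting value; the function names and the midpoint-rule integration are choices made for illustration.

```python
def debye_mean_frequency(limit_cm1, steps=10000):
    """Arithmetical average frequency for a density of vibrations proportional
    to the square of the frequency, from zero up to the limiting frequency."""
    width = limit_cm1 / steps
    weighted_sum = 0.0
    total_weight = 0.0
    for i in range(steps):
        frequency = (i + 0.5) * width
        density = frequency * frequency
        weighted_sum += frequency * density * width
        total_weight += density * width
    return weighted_sum / total_weight


def two_group_mean(first_cm1, second_cm1):
    """Average frequency when the oscillators are divided equally between two frequencies."""
    return 0.5 * (first_cm1 + second_cm1)


print(debye_mean_frequency(1332.0))   # close to three-fourths of 1332
print(two_group_mean(1332.0, 666.0))
```

Both distributions, very different in character, yield the same arithmetical average of 999 cm⁻¹, which is the point of the argument.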


The second remark we have to make is that the precise course of the 
specific heat curve in the middle range of temperatures, in other words, 
between 140° and 500°, is of the highest importance in enabling us to decide 
whether or not any assumed distribution of frequencies agrees with or differs 
radically from the actual distribution. It has already been remarked that 
in this range the Debye function gives systematically a lower specific heat 
than that observed, the maximum deviation expressed as a percentage being 
about 10 per cent. at about 200° K. The present analysis makes it clear 
that this difference arises because the distribution of the frequencies
contemplated in the Debye theory differs radically from the actual distribution;
instead of all the frequencies being densely crowded together near the upper 
end of the frequency range, they are actually distributed in a more or less 
uniform manner over a wide range of frequencies. It may be remarked 
that a deviation in the opposite sense, viz., with the calculated values higher 
than the observed ones, appears in the same temperature range if we assume 
that half the oscillators have a frequency of 1332 cm.⁻¹ and the other half
a frequency of 666 cm.⁻¹. This makes it clear that the actual distribution
of frequencies does not involve a division of the atomic oscillators into two 
groups with such widely separated frequencies. 


The third and the final remark that we have to make is in respect of the 
specific heats of diamond in the lowest part of the temperature range. The 
failure of the Debye theory to represent the course of the specific heat curve
in this region is very conspicuous. A great many measurements were made 
by DeSorbo in this part of the temperature range and as he himself has 
pointed out, they deviate markedly from the course of the Debye function 
based on a constant limiting frequency. DeSorbo has exhibited this failure 
by drawing a graph representing the “ characteristic Debye temperature ” 
as a function of the temperature, and this exhibits a pronounced peak at 60°.
As will be seen from our Fig. 3, the specific heat at this temperature given
by the Debye theory assuming the limiting frequency to be 1332 cm.⁻¹ is
60 per cent. in excess of the observed value.


Debye claimed in his original paper that the explanation of the behaviour 
of the specific heat of crystals at the lowest temperatures constituted the major 
success of his theory. Since, as we have seen, the theory actually fails most 
completely at these same low temperatures in the case of diamond, the only 
possible inference which can be drawn from the facts is that the identification 
of the thermal energy of crystals with the energy of stationary elastic vibra- 
tions in their interior on which the theory is based is a misconceived idea, 
in other words, that the theory itself is fundamentally untenable. 


6. SUMMARY 


The functional dependence of the specific heat of a crystal on the 
temperature may with advantage be expressed as a variation with temperature 
of the effective average frequency of the atomic oscillators, the same being 
determined from the argument of the Einstein function which gives the 
observed specific heat at that temperature. The usefulness of this representa- 
tion is shown in the paper by a detailed discussion of the experimental data 
for diamond. It emerges that the distribution of frequencies adopted in 
the Debye theory is irreconcilable with the observed course of the frequency- 
temperature curve. It is also pointed out that the large excess which the 
specific heat calculated from that theory exhibits over the observed values 
in the region of low temperatures shows that the ideas on which that theory 
is based are misconceived and that the theory itself is untenable. 
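
The representation described in this summary can be sketched numerically. The following is a minimal illustration, not the author's own procedure: it assumes the specific heat is supplied as a fraction of its classical limiting value, uses the standard Einstein function E(x) = x² eˣ / (eˣ − 1)² with x = hcν/kT, and inverts it by bisection. The function names and the constant `HC_OVER_K` are introduced here for illustration only.

```python
import math

HC_OVER_K = 1.438777  # second radiation constant hc/k, in cm K

def einstein_function(x):
    """E(x) = x^2 e^x / (e^x - 1)^2; tends to 1 as x -> 0 (classical limit)."""
    if x < 1e-8:
        return 1.0
    em = math.exp(-x)  # written with e^-x so that large x cannot overflow
    return x * x * em / (1.0 - em) ** 2

def effective_frequency(cv_fraction, temperature_k):
    """Wavenumber (cm^-1) whose Einstein function equals cv_fraction at T.

    cv_fraction is the observed specific heat divided by its classical value.
    """
    if not 0.0 < cv_fraction < 1.0:
        raise ValueError("cv_fraction must lie strictly between 0 and 1")
    lo, hi = 0.0, 200.0  # E(x) falls monotonically from 1 towards 0 here
    for _ in range(200):  # plain bisection on the argument x
        mid = 0.5 * (lo + hi)
        if einstein_function(mid) > cv_fraction:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * temperature_k / HC_OVER_K

# Round trip: a single oscillator frequency of 999 cm^-1 at 300 K is recovered.
x_999 = HC_OVER_K * 999.0 / 300.0
recovered = effective_frequency(einstein_function(x_999), 300.0)
print(round(recovered, 3))
```

Applied to a measured specific heat curve, the same call at each temperature would trace out the frequency-temperature curve discussed in the paper.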
Part I. A. The System Ammonium Sulphate-Water-Methanol at 30° C.
By G. ARAVAMUDAN
Received October 1, 1956
(Communicated by Prof. K. R. Krishnaswami, F.A.Sc.)


INTRODUCTION 


A study of the quaternary system ammonium sulphate-ammonium nitrate-
water-methanol is nearing completion in this laboratory and as a preliminary 
to this, determinations have been made at 30° C. of the solubilities of ammo- 
nium sulphate and ammonium nitrate in aqueous methanol solutions of vary- 
ing compositions. The results obtained are presented in this paper. 


Investigations of systems of the general type inorganic salt-water-organic 
solvent are of importance in the field of heterogeneous equilibria and the 
results obtained have been utilised for the enrichment and dehydration of 
organic solvents by the familiar salting out effect, and also in the separation 
of inorganic salt mixtures due to preferential solubility in mixtures of water 
with suitable organic solvents. The solubilities of various common inorganic 
salts in aqueous methanol solutions have been determined by many workers, 
but so far only potassium carbonate has been found to cause the formation 
of binary layers in aqueous methanol. As a rule, the solubility of nonsolvated 
salts decreases with increasing methanol concentration in the binary solvent. 
Akerlof and Turck¹ observed in the case of ten simple inorganic salts that the
logarithm of the molality of the salt in the saturated solution (log S) bears a 
simple relation to the mole fraction of methanol (m) in the solvent and that 
this can be represented by a smooth curve that is essentially linear. The object 
of the present work was to determine the solubilities of ammonium sulphate 
and ammonium nitrate in aqueous methanol solutions at 30° C., to see if the 
two salts would resemble the others in log S—m relationship, and to observe 
the influence of methanol on the aqueous solubility of the two individual salts. 


Barring methanol, all the aliphatic alcohols that are completely miscible 
with water have been employed by previous workers in ternary studies with 
ammonium sulphate and water and it was found in every case that formation 
of binary liquid layers took place in a certain range of alcohol concentration. 
Wibaut² noticed binary layer formation between alcohol concentrations
of 5 and 62% in the case of aqueous ethanol at 30° C., and found that above
90.4 weight per cent. ethanol no ammonium sulphate was dissolved. Similar
results on the same system were obtained by other workers referred to in
Seidell.³ Ginnings and co-workers,⁴⁻⁷ who studied the behaviour of
ammonium sulphate in binary mixtures of water with allyl, isopropyl and
tertiary butyl alcohols, have developed an empirical relationship between the
concentrations of the salt and the alcohol in the binary liquid layers formed.
Ammonium sulphate has only a poor salting out effect on these alcohols.

Fleckenstein⁸ employed the synthetic method for the determination of
the solubility of ammonium nitrate both in aqueous methanol and aqueous
ethanol solutions at various temperatures. Only the results for aqueous
ethanol, and that too in graphical form, are given in his paper, but
interpolation data on both systems are available in Abegg’s Handbuch.⁹
Schreinemakers¹⁰ also has quoted some of Fleckenstein’s values in his paper
dealing with the solubility of ammonium nitrate in a ternary solvent consisting
of water and the two alcohols. It was pointed out by Fleckenstein that the
solubility of ammonium nitrate, as well as its increase with rise in
temperature, was much greater in aqueous methanol than in aqueous ethanol
solutions. He further observed that whereas addition of ethanol depressed
the solubility of ammonium nitrate in water, methanol increased it
continuously. The latter statement is incorrect in the light of his own results
and those of the present work, which are shown in Table III. At high
temperatures binary layer formation was observed by Fleckenstein in aqueous
ethanol but he did not specify the exact conditions. deWahl¹¹ also determined
the solubility of ammonium nitrate in aqueous ethanol at 0°, 30° and 70° C.
and reported the occurrence of the salting out phenomenon only above 67.5° C.
Very recently, Thompson and Vener¹² determined at 5° intervals the solubility
and density isotherms of the system ammonium nitrate-water-ethanol from
25 to 75°. The plait point for the two liquid phase region was found accurately
to be at 65.7° and its composition by weight was 54% salt, 33% alcohol and
13% water. Ammonium nitrate salts out the higher alcohols even at room
temperature. Thus, according to Thompson and Molstad,¹³ in the case of
isopropyl alcohol binary solutions appear at 30° after about 15 weight per
cent. alcohol. Ginnings and co-workers developed equations relating the salt
and the alcohol concentrations in the conjugate solutions caused by
ammonium nitrate in aqueous t-butyl⁶ and aqueous allyl⁷ alcohols.

EXPERIMENTAL 

Materials used.—Ammonium sulphate of three different particle sizes
was employed to test the effect of particle size on the equilibration period.
Large crystals of ammonium sulphate were obtained by slow recrystallisation
of a pure Merck sample from aqueous solution. Particles of 20-mesh size were
graded from a B.D.H. AnalaR variety. Lastly, a very fine powdery form of
less than 100-mesh was obtained by precipitation from a saturated solution
in water with methanol. The precipitated salt was collected, washed with
acetone, exposed to a dry atmosphere and finally dried at 105°. All the samples
were of high purity as tested according to Rosin.


Ammonium nitrate was of B.D.H. AnalaR grade and was taken in both
granular and powdery forms. It was dried at 110° before use. 


Absolute methanol was prepared according to Lund and Bjerrum.¹⁵
Density determinations gave d₄²⁰ as 0.79142, in good agreement with
literature data. Qualitative tests for the common impurities established its purity.


Pure distilled water was used throughout the investigations. 


Apparatus and experimental procedure.—Initial observations showed that 
with ammonium sulphate and ammonium nitrate two liquid phases were not 
formed in aqueous methanol solutions at the experimental temperature. 
Further, the fact that the salts are not solvated by either of the solvents
greatly simplified the study of these ternary systems.


The solubility experiments were conducted in an electrically heated
water thermostat maintained at 30° ± 0.02° with the help of an electronic
relay. Experiments involving approach to equilibrium from the lower
temperature side were conveniently conducted in an apparatus whose details have
been recently published by Aravamudan and Krishnaswami.¹⁶ It is similar
to that devised by Campbell¹⁷ with the modification that it is enclosed in a
water-tight glass jacket provided with screw caps at the ends, to prevent
contact of the bath liquid with the solubility bottles. The apparatus charged with
the several complexes was rotated at about 100 r.p.m. and inverted for
automatic filtration after reaching equilibrium. This method gave very accurate
results since the separation of saturated solution from excess solid occurs in
a closed system under thermostatic conditions. Experiments from the
supersaturated side were made by simply suspending in the thermostat the leakproof
stoppered bottles containing the mixtures of excess solid and solvent
previously supersaturated at 37° for three hours. After the requisite period,
and after the solid had settled, the saturated solution was drawn off through a
pipette warmed to just above 30° and fitted with a plug of glass-wool at its
tip. Extreme care was taken to minimise the inherent errors in this method,
which should be deemed less accurate than the one followed for the
undersaturation side experiments.
In the case of ammonium sulphate, two different methods were employed
for the determinations and they gave identical results. In the first procedure, 
a known amount of dry salt was dissolved in a definite weight of water and to 
this was added enough absolute methanol to precipitate out part of solid 
from solution. The amount of methanol added was equal to the difference 
in initial and final weights. The complex was then equilibrated at 30° 
approaching equilibrium from both undersaturated and supersaturated sides. 


In the second method, which could be and was followed for both 
ammonium sulphate and ammonium nitrate, excess of the salt was added to 
aqueous methanol solutions of known composition and the equilibrium estab- 
lished as before from both sides. 


A stirring period of 24 hours was found to be sufficient to ensure attain- 
ment of equilibrium in the case of ammonium sulphate irrespective of its 
particle size. Equilibrium was attained much more quickly in the case of
ammonium nitrate. Experiments conducted with complexes of the same composition
approaching equilibrium from the opposite sides of saturation at 30° gave
closely agreeing results. Usually the weights of the components were so
chosen as to yield about 25 gm. of saturated solution and leave about
2 to 3 gm. of solid in excess. However, larger samples of the order of 50 to 70 gm.
were taken in cases where the ammonium sulphate content was very low, for
convenience in ultimate analysis.


Analytical methods.—Due to the absence of solvated solid phases and 
also of binary liquid phases, both the ternary systems at 30° break down into
a continuous series of binary systems of ammonium sulphate or ammonium 
nitrate and a solvent of varying composition from pure water to absolute 
methanol. As the relative amounts of water and methanol were known 
beforehand it was only necessary to estimate salt content in the saturated 
solutions in order to compute the relative fractions of the three components. 
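
The computation implied here can be sketched as follows. This is an illustrative reconstruction rather than the author's worksheet: it assumes the molality is reckoned per kilogram of mixed solvent (which reproduces the tabulated logarithms) and uses the molar masses written in the code, which are assumed values. The function name `saturated_solution` is hypothetical.

```python
import math

# Molar masses in g/mol (assumed values for this sketch)
M_WATER = 18.016
M_METHANOL = 32.04
M_SALT = {"(NH4)2SO4": 132.14, "NH4NO3": 80.04}

def saturated_solution(salt, salt_wt_pct, methanol_wt_pct_in_solvent):
    """Composition of 100 gm. of saturated solution from the two known quantities:
    the analysed salt content and the methanol content of the binary solvent."""
    solvent_g = 100.0 - salt_wt_pct
    methanol_g = solvent_g * methanol_wt_pct_in_solvent / 100.0
    water_g = solvent_g - methanol_g
    n_methanol = methanol_g / M_METHANOL
    n_water = water_g / M_WATER
    # moles of salt per kilogram of mixed solvent
    molality = 1000.0 * salt_wt_pct / (M_SALT[salt] * solvent_g)
    return {
        "salt_g": salt_wt_pct,
        "water_g": water_g,
        "methanol_g": methanol_g,
        "mole_fraction_methanol": n_methanol / (n_methanol + n_water),
        "log_molality": math.log10(molality),
    }

# Example: 41.14% ammonium sulphate in a solvent containing 4.86% methanol
row = saturated_solution("(NH4)2SO4", 41.14, 4.86)
print({k: round(v, 4) for k, v in row.items()})
```

Only the salt content needs to be measured; the other entries follow from the known solvent composition.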


Weighed amounts of saturated solutions were diluted with water and 
weight fractions of the diluted solution were subjected to analyses as follows: 


1. Evaporation of the solution and drying of the salts to constant
weight at 110° was suitable when methanol content was low. The method
was accurate and precise to within ± 0.05%.


2. After removal of methanol by evaporation, the solution was ana-
lysed for the ammonium radical by the formol method based on the reactions,

2 (NH₄)₂SO₄ + 6 HCHO → (CH₂)₆N₄ + 6 H₂O + 2 H₂SO₄

4 NH₄NO₃ + 6 HCHO → (CH₂)₆N₄ + 6 H₂O + 4 HNO₃

The liberated acid was titrated with standard alkali. The values obtained
in this method were reproducible to ± 0.10%.


3. The sulphate content was estimated by direct titration with standard
lead nitrate solution after addition of alcohol and acetone and employing
potassium iodide as internal indicator. The method is simple, rapid and
accurate, reproducibility being ± 0.3%. Full details of the method will be
published elsewhere.


4. In cases where the ammonium sulphate was present only in small 
amounts, gravimetric estimation of sulphate was carried out as lead
sulphate. A suitable method for this purpose has been developed by the author,
the details of which will be published separately. Co-precipitation of nitrate,
which always occurred, causes a positive error of one per cent., and this had
to be corrected for.


RESULTS AND DISCUSSION 


The results of the investigations on ammonium sulphate are presented 
in Table I, the composition of the saturated solutions being expressed in weight 
percentage. The weight percentage and mole fraction of methanol in the 
corresponding binary solvents and the calculated values for the logarithm of 
the molality of ammonium sulphate in saturated solution are also given. 
Results obtained for the ammonium nitrate system have been recorded
similarly in Table II.


It is seen that the solubility of ammonium sulphate falls off markedly 
with increasing proportion of methanol in the solvent and reaches extremely 
small values in regions of high methanol contents. The solubility undergoes
a vast change from 43.92% in water to less than 0.05% in absolute methanol.
Methanol is known to have an even more adverse effect on the aqueous solu-
bility of potassium sulphate, which is isomorphous with the ammonium salt,
according to the results of Akerlof and Turck. Kirn and Dunlap have
reported that potassium sulphate is practically insoluble in absolute methanol
whereas sodium sulphate dissolves to the extent of about 0.025% at 30°.
The aqueous solubilities of alkali metal sulphates are much more affected by
the presence of alcohols than those of the corresponding nitrates. The
author’s results for the ammonium nitrate system studied by the analytical
method are fortuitously in close agreement with those of Fleckenstein, who
employed the synthetic method, which in most cases is unreliable and only
approximate in its results. The influence of methanol on the solubility of
ammonium nitrate in water is at variance with the cases for sodium and
potassium nitrates as reported by Akerlof and Turck. The solubility of
TABLE I
System (NH₄)₂SO₄-H₂O-CH₃OH at 30° C.

Weight per cent.   Mole fraction   Logarithm of      100 gm. of saturated solution
of methanol        of methanol     molality of       contain in gm.
in solvent         in solvent      (NH₄)₂SO₄ in
                                   saturated         (NH₄)₂SO₄    H₂O       CH₃OH
                                   solution

     ..               ..             0.7730            43.92      56.08       ..
    4.86             0.028           0.7235            41.14      56.00      2.86
    7.58             0.044           0.6885            39.20      56.19      4.61
   14.16             0.085           0.5946            34.10      56.57      9.33
   17.12             0.104           0.5397            31.41      56.85     11.74
   19.60             0.121           0.4909            29.03      57.06     13.91
   24.71             0.158           0.3703            24.00      57.22     18.78
   29.85             0.193           0.2514            18.99      56.82     24.19
     ..              0.220           0.1326              ..         ..        ..
   37.36             0.247           0.0199            12.05      55.10     32.85
   41.37             0.284          −0.1167             9.17      53.25     37.58
   44.77             0.313          −0.2490             6.93      51.34     41.63
   50.17             0.362          −0.4630             4.35      47.66     47.99
   55.09             0.408          −0.6647             2.78      43.66     53.56
   59.89             0.457          −0.8544             1.80      39.39     58.81
   63.53             0.495          −1.001              1.30      36.00     62.70
   69.98             0.567          −1.292              0.67      29.83     69.50
   73.77             0.613          −1.476              0.44      26.11     73.45
   79.79             0.689          −1.758              0.23      20.17     79.60
   83.79             0.744          −1.920              0.16      16.18     83.66

TABLE II
System NH₄NO₃-H₂O-CH₃OH at 30° C.

Weight per cent.   Mole fraction   Logarithm of      100 gm. of saturated solution
of methanol        of methanol     molality of       contain in gm.
in solvent         in solvent      NH₄NO₃ in
                                   saturated         NH₄NO₃       H₂O       CH₃OH
                                   solution

     ..               ..             1.462             69.86      30.14       ..
    7.22             0.0419          1.430             68.29      29.43      2.29
   14.28             0.0857          1.399             66.73      28.52      4.75
   25.70             0.1629          1.342             63.74      26.94      9.32
   39.60             0.2696          1.256             59.06      24.73     16.21
   44.87             0.3136          1.210             56.43      24.02     19.55
   53.73             0.3950          1.136             52.24      22.11     25.65
   59.89             0.4565          1.065             48.19      20.78     31.03
   64.21             0.5021          1.010             45.13      19.64     35.24
   73.77             0.6128          0.872             37.37      16.43     46.20
   76.80             0.6507          0.826             34.91      15.10     49.99
   83.79             0.7347          0.710             29.10      11.50     59.40
   85.37             0.7663          0.671             27.31      10.64     62.05
   93.89             0.8964          0.512             20.66       4.85     74.49
  100                1.000           0.402             16.80        ..      83.20
ammonium nitrate in absolute methanol is fairly high for an inorganic salt
in an alcohol. 


The great difference in the aqueous solubilities of ammonium sulphate 
and ammonium nitrate in presence of methanol is best indicated in Table III 
wherein are given the amounts in gm. of the two salts dissolving in 100 gm. 
of water in various mixtures of water and methanol. There is a marked 














TABLE III

Weight per cent.      100 gm. of water in solvent
of methanol in        dissolve at 30° (in gm.)
solvent
                      NH₄NO₃        (NH₄)₂SO₄

     ..               231.8          78.32
   14.28              234.0          60.10ᵉ
   25.70              236.6          39.88
   39.60              238.9          19.54
   44.87              234.9          13.25ᵉ
   53.73              236.2           7.
   59.89              231.9           4.6
   64.21              229.8          [illegible]
   73.77              227.5           1.7
   76.80              231.2           1.3ᵉ
   83.79              253.0           0.96
   85.38              256.6           ..
   93.89              426.3           ..

e = Interpolated value


diminution in the values for ammonium sulphate as the fraction of methanol 
in the binary solvent is raised whereas in the case of ammonium nitrate, the 
value remains much the same over a wide range and after about 85 weight 
per cent. methanol it actually increases greatly. It may be noted that the 
cooling produced during the dissolution of ammonium nitrate decreases in 
intensity as the methanol fraction is stepped up in the binary solvent and per- 
sists even in absolute methanol. Temperature variations during dissolution 
of ammonium sulphate are hardly perceptible. 
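
The quantity tabulated in Table III follows directly from the compositions in Tables I and II. A minimal sketch of the conversion (the helper name `grams_per_100_water` is introduced here for illustration):

```python
def grams_per_100_water(salt_g, water_g):
    """Weight of salt dissolved per 100 gm. of water, from the weights of salt
    and water found in the same quantity of saturated solution."""
    return 100.0 * salt_g / water_g

# Pure-water rows of Tables I and II (gm. per 100 gm. of saturated solution)
sulphate_in_water = grams_per_100_water(43.92, 56.08)
nitrate_in_water = grams_per_100_water(69.86, 30.14)
print(round(sulphate_in_water, 2), round(nitrate_in_water, 1))
```

Because the divisor is the water alone, a salt whose solubility is little affected by methanol gives a nearly constant figure, which is the behaviour noted above for ammonium nitrate.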


The solubility data on the two ternary systems have been plotted on the 
weight per cent. scale on the triangular diagram in Fig. 1. The saturation 
curve ABCD for ammonium sulphate shows a slight increase in the weight 
fraction of water in the saturated solution over the initial part AB, then a 
gradual decrease over BC and a rapid fall in the region CD. The fall in solu-
bility of ammonium sulphate is relatively greater in regions of high methanol
concentrations as indicated by the steep curve CD. The saturation curve is
typical of the solubility behaviour of most of the inorganic salts in aqueous
alcohols. A very large section of the triangle is occupied by the saturation
field and the line MA representing all mixtures of methanol with the saturated
solution of ammonium sulphate in water is well within the solubility region
signifying copious precipitation of ammonium sulphate from its aqueous
solution by the addition of methanol. The curve EFG is obtained for the
ammonium nitrate system and is almost unique. The line ME only grazes
the saturation curve indicating that ammonium nitrate can scarcely be preci-
pitated out of its solution in water by addition of any amount of methanol.
This property is perhaps met with only in the case of a very few highly soluble 
salts. 


The wide variation in the solubility of ammonium sulphate with the sol-
vent composition does not lend itself to accurate plotting by means of the
triangular diagram. Rectangular plots would be more accurate and easier
to read. In Fig. 2, the three types of conventional curves are drawn. The
weight per cent. of methanol in the binary solvent is taken as the ordinate
against which are compared either (I) the weight per cent. of ammonium
sulphate in the saturated solution, (II) the weight of salt dissolving in 100 gm.
of solvent or (III) the weight of salt dissolving in 100 gm. of water in the
solvent. Curve III has the best spread and can be used for interpolation of
salt solubilities in solvents of known composition whereas curve II is of more
general use.


In Figs. 3 and 4 are given plots of mole fraction of methanol in binary
solvent against the logarithm of the molalities of ammonium sulphate and
ammonium nitrate respectively in the corresponding saturated solutions. In
the case of ammonium sulphate there is perfect linearity in the region between
0.15 and 0.70 mole fraction of methanol, the curve having only a gentle bend
before and after these limits. In the case of ammonium nitrate, the curve has
two linear portions with different slopes intersecting at about 0.35 mole frac-
tion methanol. Apart from their utility in the understanding and interpre-
tation of solubility phenomena, these curves make possible better interpola-
tions than can be obtained from the conventional solubility diagrams.
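
The linearity claimed for ammonium sulphate can be checked, and used for interpolation, with an ordinary least-squares line through the Table I entries lying between 0.15 and 0.70 mole fraction of methanol. The sketch below is only an illustration of that check; the paper itself reads the relation off a graph, and the helper `linear_fit` is a name introduced here.

```python
# (mole fraction of methanol, log molality) pairs taken from Table I
TABLE_I = [
    (0.158, 0.3703), (0.193, 0.2514), (0.220, 0.1326), (0.247, 0.0199),
    (0.284, -0.1167), (0.313, -0.2490), (0.362, -0.4630), (0.408, -0.6647),
    (0.457, -0.8544), (0.495, -1.001), (0.567, -1.292), (0.613, -1.476),
    (0.689, -1.758),
]

def linear_fit(points):
    """Least-squares slope, intercept and r^2 for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return slope, intercept, 1.0 - ss_res / ss_tot

slope, intercept, r_squared = linear_fit(TABLE_I)

def interpolate_log_molality(mole_fraction):
    """Log molality read off the fitted line (valid only inside the fitted range)."""
    return intercept + slope * mole_fraction

print(round(slope, 2), round(intercept, 3), round(r_squared, 4))
```

The fitted slope comes out at roughly −4 per unit mole fraction, with a coefficient of determination very close to unity, consistent with the statement in the text.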
SUMMARY


The solubility isotherms of the ternary systems ammonium sulphate- 
water-methanol and ammonium nitrate-water-methanol have been deter- 
mined at 30° employing an efficient solubility apparatus. Neither of the two 
systems reveals the formation of solvated solid phases or of binary solutions. 
Methanol greatly reduces the aqueous solubility of ammonium sulphate but 
it has little effect on that of ammonium nitrate when added in moderate 
amounts. Very large amounts of methanol, however, actually increase the 
solubility of ammonium nitrate in water. Salt solubilities in a binary solvent 
of varying composition have been represented diagrammatically. The rela- 
tionship between the logarithm of molality of the salt in solution and the 
mole fraction of methanol in the binary solvent in both the systems is in accord 
with the general observations of Akerlof and Turck. 
1. INTRODUCTION


It is well known that if a beam of X-rays is reflected from a crystal at a Bragg
angle of 45°, the reflected monochromatic beam should be completely pola- 
rised, as it is deviated exactly at right angles to the incident beam. 


This paper is concerned with an experimental arrangement very closely 
approximating to this ideal arrangement and a test of its performance. It 
was found that the fraction of unpolarised X-rays in the beam was only 
0.2 per cent., which is comparable to the performance of polaroids for light.
The study was made during the course of an investigation of the texture of 
crystals by means of polarised X-rays (Ramaseshan and Ramachandran, 
1954). 

2. EXPERIMENTAL DETAILS 


A single crystal of copper was mounted in front of an X-ray tube with a
copper target and the crystal was set to give the 311 reflection, whose Bragg
angle θ is 45° 9′ for the CuKα radiation (1.542 Å). The arrangement was
the same as described earlier (Chandrasekaran, 1955 a, b). The degree of
polarisation in the reflected beam was analysed by using a second crystal of
copper set for the same reflection. The surface of this crystal was lightly
ground to render it mosaic. The polarised beam was incident on this crystal
along the rotation axis of the spindle carrying the goniometer on which the
crystal was mounted and in this way the plane of reflection of the second
crystal could be rotated relative to the plane of vibration in the polarised
beam. The integrated reflection ρ(φ) was measured at each azimuth φ with
respect to the plane of polarisation by means of a Geiger counter (for details,
see Chandrasekaran, 1955). A monitor was used to allow for variations of
incident intensity. The data obtained are given in Table I where r(φ)
stands for the ratio ρ(φ)/ρ(0), φ = 0 corresponding to the position when ρ(φ)
is maximum, that is, plane of reflection perpendicular to the plane of vibra-
tion.
TABLE I
Measured values of r(φ)

φ in degrees     Mean value of r(φ)      cos²(φ) calculated,
                 measured,† in units     in units of 10⁻³
                 of 10⁻³

    60                249                    250
    65                174                    179
    70                121                    117
    75                 71                     67
    77.5               52                     47
    80                 32                     30
    82.5               17                     17
    85                  8.7                    7.6
    87.5                2.8                    1.9
    90                  1.8                    0
    92.5                4.7                    1.9
    95                 11.1                    7.6
    97.5               19                     17
   100                 28                     30

† Mean of several measurements for both +φ and −φ.


3. THEORETICAL CONSIDERATIONS


Theoretically, the variation of r(φ) with φ is given by the formula


r(φ) = cos²(φ) + α sin²(φ)


where α is the value of r(φ) for φ = 90° and depends on the state of perfec-
tion of the crystal. If it were ideally mosaic, the value (αₘ) would be cos² 2θ.
If it were perfect, the value (αₚ) would be |cos 2θ|. However this is true
only for a symmetric surface reflection from a crystal of negligible absorp-
tion. In the actual conditions of experiment, the reflection from the copper
crystal was asymmetric (the acute angle between the surface and the reflect-
ing planes being 25°) and further absorption (μ of copper = 466 cm.⁻¹ for
CuKα radiation) was quite large. The value of α for a perfect crystal, taking
these factors into account, (αₚₐ) was calculated using the formula developed
by Hirsch and Ramachandran (1950). The relevant results are given in
Table II. The values of the integrated reflection ρ are for polarised X-rays
and for φ = 0°.
TABLE II
Calculated values of α and integrated intensity for different
states of perfection

αₚ  = |cos 2θ| = 5.24 × 10⁻³
αₚₐ = 6.45 × 10⁻³                 ρₚₐ = 72.7 microradians
αₘ  = cos² 2θ = 2.75 × 10⁻⁵       ρₘ  = 427 microradians
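
The two closed-form entries of Table II, the perfect-crystal value |cos 2θ| and the ideally mosaic value cos² 2θ, can be reproduced from the Bragg angle of 45° 9′ alone. A short check (the function name is introduced here for illustration; the absorption-corrected value requires the Hirsch and Ramachandran formula and is not attempted):

```python
import math

def residual_ratios(degrees, minutes):
    """Return (|cos 2θ|, cos² 2θ) for a Bragg angle given in degrees and minutes."""
    theta = math.radians(degrees + minutes / 60.0)
    cos_2theta = math.cos(2.0 * theta)
    return abs(cos_2theta), cos_2theta ** 2

alpha_perfect, alpha_mosaic = residual_ratios(45, 9)
print(f"{alpha_perfect:.3e}  {alpha_mosaic:.3e}")
```

Both values vanish at exactly 45°, which is why a reflection so close to that angle gives nearly complete polarisation.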


4. COMPARISON OF THEORY WITH EXPERIMENT 


It is seen from Table II that the theoretical values of α are quite small,
being about 6 × 10⁻³ for a perfect crystal and as low as 3 × 10⁻⁵ for a mosaic
crystal. The experiments were made with a lightly ground crystal, as men-
tioned earlier. However, the minimum observed value of r(φ) was only
1.8 × 10⁻³. With this value of α the measurements are all consistent, as
may be seen from Table III.


TABLE III


Comparison of observation with the theoretically calculated
r(φ) assuming α = 1.8 × 10⁻³

   φ       r(φ) exp.    r(φ) theor.   |    φ       r(φ) exp.    r(φ) theor.

  60°        249           251        |   80°         30            32
  65         174           180        |   82.5        18            19
  70         121           119        |   85           9.9           9.3
  75          71            69        |   87.5         3.8           3.7
  77.5        52            49        |   90           1.8           1.8
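
The theoretical column of Table III follows from the formula r(φ) = cos²(φ) + α sin²(φ) with α = 1.8 × 10⁻³. A small sketch that evaluates it in the table's units of 10⁻³ (the function name `r_theoretical` is introduced here for illustration):

```python
import math

ALPHA = 1.8e-3  # observed minimum value of r(φ), taken from the text

def r_theoretical(phi_degrees, alpha=ALPHA):
    """r(φ) = cos²φ + α sin²φ for an azimuth φ given in degrees."""
    phi = math.radians(phi_degrees)
    return math.cos(phi) ** 2 + alpha * math.sin(phi) ** 2

for phi in (60, 65, 70, 75, 77.5, 80, 82.5, 85, 87.5, 90):
    print(f"{phi:5.1f}  {1000.0 * r_theoretical(phi):7.1f}")
```

Setting `alpha` to zero recovers the plain cos²(φ) column of Table I, so the difference between the two tables isolates the contribution of the unpolarised component.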





The state of perfection of the second crystal was, therefore, assessed by
measuring its integrated intensity for φ = 0° in absolute units. This was
done by comparing its integrated reflection with that for a surface reflection
400 of a ground rocksalt crystal. The latter was corrected for secondary
extinction. The final value obtained for the 311 reflection of copper, namely
222 microradians, is expected to be correct to 5 per cent. By comparing this
with the calculated values in Table II, it is seen that although the grinding had
not rendered the crystal ideally mosaic, its perfection is quite low. Thus,
α for this crystal must be well below 6 × 10⁻³, which definitely disagrees with
the measured value of 1.8 × 10⁻³.
5. ORIGIN OF THE DISCREPANCY


In view of this, attempts were made to check whether the high observed
value of α was due to a double reflection. A double reflection 111 + 1̄1̄3
= 004 has in fact been reported for copper (Rachinger, 1950). For this pur-
pose, the crystal was rotated in its own plane by arbitrary angles and the ex-
periments were repeated at each setting, but in all cases, the measured values
of α were consistently close to the value in Table I.


Another possible cause for the occurrence of such a depolarisation is
the divergence of the incident beam as well as that of the reflected beam,
caused by the mosaicity of the crystal. In light scattering, it is well known
that divergence leads to depolarisation and has to be corrected for. However,
such an effect cannot occur here, because each crystallite would reflect X-rays
exactly at θ, so that the deviation would also be 2θ, with a width of the order of
seconds of arc (provided the crystallite size is not too small). The divergence
arises only due to disorientation of the crystallites and cannot, therefore,
lead to a depolarisation. There was also a divergence in the plane of reflec-
tion from the first crystal. This was limited in the experiment by a collimator
and was estimated to be about 1-2°. This again cannot lead to a depolarisa-
tion of the order observed in these experiments. The exact reason for the
large observed value was finally traced to the fact that the polarising crystal
was copper and therefore emitted a small amount of fluorescent CuKα
radiation. It is well known that fluorescent radiation is always unpolarised;
in fact it is so even if the exciting radiation is polarised.


It is interesting to note that the ratio of the intensity at 0° and 90° is 
nearly 500: 1 and it is believed that this is the best extinction of polarisation 
ever obtained with X-rays and compares very favourably with the performance 
of optical polaroids. This ratio could in fact have been better, but for the 
emission of fluorescent CuKα radiation by the polarising crystal.


6. SUMMARY 


The performance of an experimental set-up for polarising X-rays, used 
by the author in the study of crystal perfection, is discussed. A value of 
0.2 per cent. for the percentage of unpolarised X-rays, believed to be the best
reported, was obtained using the 311 reflection (Bragg angle 45° 9′) of a copper
crystal with CuKα (λ = 1.542 Å) radiation.


Theoretically, the perfection of polarisation should have been even better,
but for a small amount of fluorescent CuKα radiation emitted by the polaris-
ing crystal.
1. INTRODUCTION


SODIUM CHLORITE, which is present in the commercial product ‘Textone’, is
generally prepared by the action of chlorine dioxide on hydroxides or per-
oxides of alkalis and alkaline earths. The only reference on the use of per-
salts appears to be a qualitative observation of Levi¹ that persulphates do
not react with chlorine dioxide to give the corresponding chlorites but that
percarbonates and perborates do so. In connection with the investigations
started on the study of the reactions of perborates with both organic²,³ and
inorganic substances, quantitative data on the formation of sodium
chlorite by the action of chlorine dioxide on sodium perborate under
different conditions were obtained and are reported in this paper.
2. EXPERIMENTAL 

In the experimental procedure, a convenient mol. fraction (varying from
0.005 to 0.05) of the reactants (either sodium perborate, hydrogen peroxide
or any other reagent) was diluted to 100 c.c. and treated at 29-30° C. with
chlorine dioxide prepared by the oxalic acid method⁴ till the reaction was
complete. Excess of chlorine dioxide was then removed by aeration. The result-
ing colourless solution was estimated for its chlorite content from an aliquot
portion using the method of White.⁵ The data obtained are given in Table I.





TABLE I 

Amount of NaBO₃·4H₂O    Amount of NaClO₂ in gm.      Yield of NaClO₂, 
(mol. fr.)              formed in the mixture = Y    % of theory = P 

0.005                    0.565                       62.48 
0.010                    1.130                       62.49 
0.02                     2.261                       62.49 
0.05                     5.517                       60.99 
0.1                      9.993                       55.23 
0.2                     16.550                       45.73 

The data in Table I show that the yield of sodium chlorite as per cent. 
of theory is practically constant in the beginning and then gradually 
diminishes. This behaviour appears to be due to the fact that sodium chlorite 
decomposes at higher concentrations, since a larger amount of sodium chlorite 
is produced with higher proportions of sodium perborate in the same total 
volume. 
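
The P column of Table I can be checked with a little arithmetic. The sketch below is illustrative only: it assumes (the text does not state this) that the theoretical yield is two moles of NaClO₂ per mole of perborate, and it takes 90.44 g/mol for NaClO₂; the function and variable names are hypothetical.

```python
# Molar mass of NaClO2 in g/mol (assumed value, not quoted in the paper).
M_NACLO2 = 90.44


def percent_of_theory(y_grams, mol_perborate, mol_per_mol=2.0):
    """Yield P as a percentage of theory.

    mol_per_mol is the ASSUMED stoichiometry: moles of NaClO2
    obtainable per mole of NaBO3.4H2O taken.
    """
    theoretical_grams = mol_per_mol * mol_perborate * M_NACLO2
    return 100.0 * y_grams / theoretical_grams


# (mol. fr. of perborate, Y in gm., P as reported in Table I)
TABLE_I = [
    (0.005, 0.565, 62.48),
    (0.010, 1.130, 62.49),
    (0.02, 2.261, 62.49),
    (0.05, 5.517, 60.99),
    (0.1, 9.993, 55.23),
    (0.2, 16.550, 45.73),
]

for mol, y, p_reported in TABLE_I:
    p = percent_of_theory(y, mol)
    print(f"{mol:5.3f}  P computed = {p:6.2f}  P reported = {p_reported}")
```

Under these assumptions the computed percentages agree with the tabulated ones to within a few hundredths of a per cent., which suggests that the tabulated P was obtained in this way.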

































To note the effect of change of temperature, experiments were carried 
out at 15°, 30° and 35° C., which revealed that there is practically no change 
in the yield of sodium chlorite when the temperature is changed from 15° to 35° C. 
This showed that the reaction between chlorine dioxide and sodium perborate 
could be conducted without any need for elaborate temperature control. 
All experiments, therefore, were carried out at the room temperature, which 
was 29-30° C. 


Another series of experiments carried out in order to see if the yield could 
be increased by the addition of hydrogen peroxide, boric acid or borax to 
the reaction mixture following the same procedure gave data which are dis- 
cussed below (Table II). 


TABLE II 

Mol. fr. of NaBO₃·4H₂O          0.005                0.01 

Molecular ratio of 
NaBO₃ : H₂O₂                  Y        P           Y        P 

1 : 0.5                     0.565    41.64       1.13     41.66 
1 : 1                       0.565    31.23       1.13     31.245 
1 : 1.5                     0.546    24.14       1.062    23.49 




In Table II, the amount of sodium chlorite (Y) is unaffected 
by the presence of hydrogen peroxide, but when the yields are considered on the 
basis of the total available oxygen in the mixture they are far lower than 
theory, as shown under P. In other words, the formation of sodium chlorite appears 
to depend on the sodium perborate present in the mixture and not on the available 
oxygen alone. The yield remained similarly unaffected when increasing 
amounts of boric acid were added. 


Experiments on the action of chlorine dioxide on sodium perborate in 
the presence of borax revealed that the yield is greatly increased, as given in 
Table III below: 

TABLE III 

Mol. fr. of NaBO₃·4H₂O      0.005            0.01             0.02             0.05 

Molecular ratio of 
NaBO₃ : Na₂B₄O₇           Y      P        Y      P        Y      P        Y      P 

1 : 0.5                  0.87   96.18    1.741  96.23    3.48   96.23    8.107  89.42 
1 : 1                    0.87   96.18    1.741  96.23    3.48   96.23    8.107  89.42 
1 : 1.5                  0.87   96.18    1.741  96.23    3.48   96.23    8.107  89.62 


The addition of borax therefore increases the yield, though no advantage 
is gained by adding increasing amounts of the same. The lower percentage 
in the case of 0.05 mol. fraction is attributed to the decomposition of sodium 
chlorite at higher concentrations. This means that the yield of sodium 
chlorite is dependent upon the alkalinity of the solution and not upon the 
available oxygen alone. 


To verify this conclusion, experiments were carried out in which sodium 
perborate was dissolved in sodium hydroxide of different normalities (N), 
making up the volume to 100 c.c. The data obtained are given below in 
Table IV. 


TABLE IV 

Mol. fr. of NaBO₃·4H₂O      0.005     0.01      0.02 

N of NaOH                   0.058     0.116     0.232 

Y                           0.870     1.741     3.48 

P                          96.18     96.23     96.23 


From the data in Table IV it appears that by adding sodium hydroxide 
of suitable concentration, in place of borax, to the reaction mixture the same 
yield of sodium chlorite is possible. Experiments carried out by passing 
chlorine dioxide through solutions of sodium hydroxide of various normalities 
showed that the amount of chlorite formed is very small compared with that 
obtained when mixtures of sodium perborate and alkali are used. 

In the reaction of chlorine dioxide with sodium perborate alone the yield 
reaches a maximum of about 62.5%, but the same can be augmented by 
increasing the alkalinity by the addition of borax or sodium hydroxide. No 
advantage is gained by the addition of hydrogen peroxide or boric acid alone, 
nor is any appreciable amount of chlorite formed with sodium hydroxide 
alone. Thus it appears that it is the available oxygen as well as the alkalinity 
of the solution which, taken together, give a nearly theoretical yield of sodium 
chlorite. 


Chlorine dioxide gave no chlorous acid with hydrogen peroxide alone, 
but if alkali or an alkaline substance was also present along with hydrogen 
peroxide, appreciable quantities of chlorite were detected, as will be seen from 
Tables V and VI. 

TABLE V 

Mol. fr. of H₂O₂      0.005      0.01      0.02 
N of NaOH             0.058      0.116     0.232 
Y                     0.5652     1.130     2.26 
P                    62.48      62.49     62.49 

TABLE VI 

                                   Molecular ratio of H₂O₂ and 

Mol. fr.             Na₂CO₃                   NaHCO₃               Na₂B₄O₇·10H₂O 
of H₂O₂       1:0.5   1:1    1:1.5     1:0.5   1:1    1:1.5     1:0.5   1:1    1:1.5 

0.005    Y    0.406   0.565  0.588     0.216   0.452  0.678     0.407   0.825  0.880 
         P   44.88   62.46  65.0      23.89   49.99  74.99     44.99   91.20  97.45 
0.01     Y    0.814   1.153  1.176     0.452   0.905  1.357     0.846   1.650  1.760 
         P   44.99   63.73  64.98     24.99   49.99  74.98     46.76   91.20  97.45 
0.02     Y    1.628   2.306  2.351     0.905   1.831  2.714     1.673   3.323  3.527 
         P   44.99   63.73  64.97     24.99   50.60  75.01     46.22   91.84  97.45 
0.05     Y    4.115   5.778  [illeg.]  2.261   4.545  6.76      4.115   8.185  8.795 
         P   45.49   63.98  64.74     25.0    50.2   74.73     45.49   90.48  97.23 

Results in Table V indicate that a solution of sodium perborate behaves 
in its reaction with chlorine dioxide in the same way as a solution containing 
equivalent amounts of sodium hydroxide and hydrogen peroxide. The 
particular advantage in preferring sodium perborate to a hydrogen peroxide and 
sodium hydroxide mixture is that it is solid and stable and can be most easily 
and conveniently handled. 


In Table VI the higher yield (nearly 75% for the ratio 1 : 1.5) in the case 
of sodium bicarbonate than in the case of sodium carbonate is probably due to 
the instability of hydrogen peroxide in the presence of alkali. The use of 
borax in place of sodium hydroxide, sodium carbonate or sodium bicarbonate 
gives an appreciable increase in the yield of sodium chlorite. When these 
values are compared with those in Table III, where sodium perborate is used 
instead of hydrogen peroxide, the yield is found to be very nearly the same. 


Hydrogen peroxide in the presence of boric acid gave no chlorous acid 
with chlorine dioxide. 


Experiments to observe the yield of sodium chlorite by taking the combi- 
nation: hydrogen peroxide, sodium perborate and borax gave results which 
are tabulated below: 




















TABLE VII 

Mol. fr. of H₂O₂               0.005               0.01               0.02 

Molecular ratio of 
H₂O₂ : NaBO₃ : Na₂B₄O₇       Y        P         Y        P         Y        P 

0.5 : 1 : 1               [illeg.]  96.6      2.62     96.54     5.2      95.8 
1 : 1 : 1                  1.402    77.49     2.8      77.38     5.607    77.48 
1.5 : 1 : 1                1.380    61.02     2.787    61.48     5.584    61.72 








It is seen from the above table that the yield of sodium chlorite is nearly 
the same as that obtained with sodium perborate and sodium hydroxide. 
Hence one might take hydrogen peroxide, sodium perborate and borax or 
sodium perborate and sodium hydroxide for the preparation of sodium 
chlorite. However, due to the instability of hydrogen peroxide in the pre- 
sence of alkali the former is not suitable but a mixture of sodium perborate 
and sodium hydroxide is more convenient. 

3. SUMMARY 


Chlorine dioxide and sodium perborate give sodium chlorite and the 
yield is not affected by the addition of hydrogen peroxide or boric acid in 
different ratios but is greatly enhanced in the presence of borax, sodium 
hydroxide or sodium carbonate. The reaction between chlorine dioxide and 
hydrogen peroxide or alkali alone is negligible. The chlorite formation is 
determined by the available oxygen and the alkalinity of the solution. By 
proper adjustment of experimental conditions nearly theoretical yield of 
sodium chlorite is obtained. 

§ 1. INTRODUCTION 


Two beams in different states of polarisation are said to be incoherent when 
they cannot be made to interfere even after being resolved into the same state 
of vibration by the use of an analyser. Thus, if unpolarised light be incident 
on a transparent wedge of quartz the fact that the two oppositely polarised 
beams into which it is split are incoherent is experimentally demonstrated 
by the complete absence of interference effects using an analyser alone. 
Similarly we can speak of two polarised beams as being completely coherent 
only when, by the use of a suitable analyser, interference effects of maximum 
clarity can be produced—the interference minimum having zero intensity. 


As was remarked in Part I (Pancharatnam, 1956) if an extended source 
of light be viewed through a plate of an absorbing biaxial crystal cut normal 
to an optic axis, faint interference rings can be seen by the use of an analyser 
alone behind the crystal plate—even with the incident light unpolarised. 
It follows that when a pencil of unpolarised light falls on such a medium, the 
two non-orthogonally polarised pencils into which it is split must be regarded 
as partially coherent—since they satisfy neither the test for incoherence 
nor that for coherence, given in the previous paragraph. Hence, a discus- 
sion of the interference phenomena presented with an analyser alone involves 
really the analysis of the interference of two partially coherent beams which 
are resolved to the same state of elliptic vibration by the use of an analyser 
(§ 4). Even when both polariser and analyser are absent, rudimentary traces 
of an interference pattern may be seen. This requires the analysis of the 
following problems: the direct interference of two partially coherent beams 
in different states of polarisation (§ 6); the composition of two such partially 
coherent beams to form a partially polarised beam—at the second face of the 
crystal (§ 6); and the converse process—occurring at the first face of the 
plate—of decomposing unpolarised light (or, more generally, incompletely 
polarised light) into two completely polarised vibrations which, we shall 
find, are partially coherent (§ 7). In § 8 we shall consider the addition of n 
partially coherent beams which are all completely polarised. 


We may mention that without resort to the use of absorbing biaxial 
crystals, it is easily possible to produce two partially coherent polarised 
beams. For example, we will have two such completely polarised and phy- 
sically separate pencils of light emerging from a rhomb of calcite if we allow 
a single narrow pencil of partially polarised light to fall on the first face. 
The test of their partial coherence is as before the fact that even after being 
resolved into the same state of vibration by the use of an analyser, their 
interference is only partial—as may be seen in a conoscopic arrangement. 
[Our study of such partially coherent beams in opposite states of polarisa- 
tion (§ 5) has an interesting theoretical consequence: it reveals in a direct 
manner the equivalence of the Poincaré and the Stokes representation of 
an arbitrarily polarised light beam.] 


The mutual interference characteristics of two incompletely coherent 
polarised beams which have been derived by the splitting of an incomplete- 
ly polarised beam can no doubt be described using only the extreme concepts 
of coherence and incoherence. For example, at the end of §7 we shall show 
that when unpolarised light is split into two non-orthogonally polarised 
elliptic vibrations, the partially coherent components obtained will behave 
as if a certain independent fraction f of the intensity of one beam were com- 
pletely coherent with the whole of the second beam—having a definite phase 
advance over it; this result is obtained by regarding the original unpolarised 
beam as the sum of two incoherent beams in specific orthogonally polarised 
states. Similarly, if we consider the example of partial coherence quoted 
in the previous paragraph, the partially polarised pencil (falling on the 
calcite rhomb) may be regarded as composed of a polarised and an unpolar- 
ised portion which are incoherent: the calcite splits the former into two 
coherent and the latter into two incoherent orthogonal vibrations. Thus, 
the two orthogonally polarised pencils emerging from the calcite will behave 
as if an independent fraction f_1 of one pencil were completely coherent with 
an independent fraction f_2 of the second (having a definite phase advance 
over it), the remaining fractions being incoherent with one another. An 
unsatisfactory feature of such methods of analysis is that the result depends 
essentially on regarding the given incompletely polarised beam as the sum 
of two other incoherent beams (whose decomposition characteristics are 
known in terms of the ideas of coherence and incoherence alone). For 
example, we would have arrived at a completely different picture of the state 
of affairs existing between the partially coherent pencils emerging from the 
calcite rhomb, if we had regarded the incident partially polarised beam as 
the sum of two incoherent orthogonally polarised pencils of different inten- 
sity. A second unsatisfactory feature is that though we may be sure that 
these different pictures of the same state of affairs must lead to identical 
results, the reason why in fact they do so—i.e., the invariant character under- 
lying these and other possible representations of two partially coherent beams 
—becomes veiled in obscurity. 


In order to effect a deeper analysis of the problem it becomes necessary 
to have (a) a method of defining directly the state of partial polarisation of 
any beam—without having to regard it as the sum of two other beams; 
and (b) a method of defining directly the mutual parameters (the degree 
of coherence and the effective phase difference) which will determine the 
consequence of the superposition of any two polarised beams—without 
having to regard each beam as the sum of several independent fractions. 
These two problems will accordingly be discussed in the two subsequent 
articles. 


§2. THE REPRESENTATION OF PARTIALLY POLARISED LIGHT 


Since, as has already been remarked, the addition of two partially 
coherent beams in different states of elliptic polarisation results in a beam 
which is necessarily incompletely polarised, we shall in this section digress 
on a method of extending the Poincaré method so that it may also be used 
to represent the state of a partially polarised beam. This extension (and its 
relation to the representation introduced by Stokes) has already been indi- 
cated by Fano (1949) and discussed in more detail by Ramachandran (1952) 
—but we shall introduce it in a somewhat different fashion which is more 
suited to our present requirement. The present section also constitutes in 
itself a presentation of the subject of the Stokes parameters of partially 
polarised radiation by a new procedure—through the Poincaré represen- 
tation itself. The conventional presentation of the subject of Stokes para- 
meters may be found in Chandrasekhar (1949) and Rayleigh (1902). 


Hitherto (see Fig. 1) we have represented the state of polarisation of 
an elliptic vibration by a corresponding point P on a Poincaré sphere of unit 
radius whose centre is O. (The longitude 2λ gives the azimuth λ of the 
major axis, and the latitude 2ω gives the ellipticity tan ω.) Instead, if we 
draw in the direction OP a vector s whose length s is made equal to the 
intensity i of the elliptic vibration, then this vector represents not only the 
state of polarisation but also the intensity of the elliptic vibration. We 
shall refer to s as the ‘Stokes vector’ defining the ideal elliptic vibration 
(see next paragraph). 


We shall now describe in an explicit manner the picture—assumed 
implicitly in the usual presentation of Stokes parameters—of the vibration 
in an actual beam of sensibly monochromatic radiation. It is known that 
the existence of incompletely polarised beams and of beams not coherent 
with one another can be reconciled with the wave theory of light when the 
extremely short period of light vibrations is taken into account. The 
phenomena depending on the interference of light merely show that for a 
duration very long compared with the period of the light wave, the vibration 
of the light cannot depart sensibly from an ideal periodic vibration de- 
scribed in two dimensions—i.e., an elliptic vibration constant in form, in- 
tensity and absolute phase. During this interval the vibration can be charac- 
terised by a definite temporary ‘ Stokes vector’ s. In a light beam of the 
most general type we can conceive, with constant macroscopic properties, 
the vector s which specifies the temporary intensity and polarisation may yet 
fluctuate millions of times a second. The optical characteristics of the beam 
observed in usual experiments depend only on certain average quantities. 
These, we find, are the intensity I (which is the average of the temporary 
intensity i) and a vector S which is the average of the temporary ‘ Stokes 
vector’ s. Thus 


$$I = \langle i \rangle; \qquad \mathbf{S} = \langle \mathbf{s} \rangle \tag{1}$$


where the bent brackets denote ‘ the average value of’. The vector S may be 
called the three-component part of the Stokes vector of the actual light beam, 
but we shall merely refer to it as the Stokes vector. The Stokes vector may be 
specified by its components with respect to any co-ordinate system with origin 
at the centre of the sphere. In the particular case when we choose a 
right-handed co-ordinate system OX_1Y_1Z_1 with the X_1Y_1 plane coinciding with the 
equatorial plane, the components of S will be denoted by Q, U, V. The 
four quantities I, Q, U, V will be called the Stokes parameters of the beam 
(with reference to co-ordinate axes on the wave-front of the beam given by 
X_1 and X_1′). In presenting the subject in this fashion we are anticipating the 
fact (to be proved in § 5) that the parameters defined in this geometrical 
manner in the Poincaré representation are identical with those introduced 
by Stokes analytically in an entirely different manner. 


In the special case of a completely polarised beam there are no 
fluctuations in the temporary polarisation but only in the temporary intensity i; 
hence the temporary Stokes vector s does not fluctuate in direction but only 
in its length i. In this case we have obviously I = S, or 


$$I^2 = Q^2 + U^2 + V^2 \tag{2a}$$


When there are also fluctuations in the temporary polarisation (i.e., in 
the direction of s) the beam is partially polarised (or, in special cases, un- 
polarised). In such cases the specification of the intensity I in addition to 
the Stokes vector S is no longer redundant. If f_j denotes the fraction of the 
time for which the vibration is in the state s_j, then the Stokes vector S is by 
definition the sum of the vectors f_j s_j. Now it is an obvious geometrical 
fact that the resultant of a number of vectors not all in the same line must 
have a length S which is less than the sum of the lengths f_j i_j of the individual 
vectors. Hence for any beam not completely polarised, I > S or 


$$I^2 > Q^2 + U^2 + V^2 \tag{2b}$$


One may compare the simplicity of the above proof with that used in the 
usual treatment of Stokes parameters (Chandrasekhar, Rayleigh, loc. cit.). 
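
As a numerical illustration of (1), (2a) and (2b), the averaging of the temporary Stokes vector may be sketched as follows. This is only a sketch under stated assumptions: the function names and the sample states are introduced here for illustration, and the position on the sphere is passed directly as a longitude and latitude in radians.

```python
import math


def stokes_vector(intensity, lon, lat):
    """Temporary Stokes vector s: its length is the temporary intensity and
    its direction is the point (lon, lat) on the Poincare sphere."""
    return (
        intensity * math.cos(lat) * math.cos(lon),
        intensity * math.cos(lat) * math.sin(lon),
        intensity * math.sin(lat),
    )


def average_beam(states):
    """states: list of (f_j, i_j, lon_j, lat_j), the fractions f_j summing to 1.
    Returns (I, S) as the averages of i and s, as in equation (1)."""
    total_i = sum(f * i for f, i, _, _ in states)
    total_s = [0.0, 0.0, 0.0]
    for f, i, lon, lat in states:
        s = stokes_vector(i, lon, lat)
        for k in range(3):
            total_s[k] += f * s[k]
    return total_i, tuple(total_s)


def length(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(x * x for x in v))


# Completely polarised: the direction of s is fixed, only its length fluctuates.
I_pol, S_pol = average_beam([(0.5, 1.0, 0.3, 0.2), (0.5, 3.0, 0.3, 0.2)])

# Partially polarised: the direction of s fluctuates as well.
I_par, S_par = average_beam([(0.5, 1.0, 0.0, 0.0), (0.5, 1.0, math.pi / 2, 0.0)])

print(I_pol, length(S_pol))  # equal, as in (2a)
print(I_par, length(S_par))  # I exceeds S, as in (2b)
```

In the second example the two temporary vectors are at right angles on the sphere, so the averaged vector is shorter than the averaged intensity, which is the geometrical content of (2b).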


We shall now show that I and S together completely determine the 
appearance presented when the beam is passed through any transparent 
double refracting crystal followed by a linear analyser—i.e., when the beam 
is passed through any elliptic analyser which transmits completely light of 
some particular polarisation C (see Part I, § 8). The state of polarisation 
transmitted by the analyser, instead of being specified by the point C on the 
Poincaré sphere, may equally well be specified by a unit vector C drawn from 
the centre of the sphere to the point C. The C-component of the beam will 
then have a temporary intensity i_c which, according to a fundamental 
property of the Poincaré sphere, is equal to i cos^2(θ/2), where θ is the angle 
between C and s (see Part I, § 2). Since i = s, we have 


$$i_c = \tfrac{1}{2}\,(i + \mathbf{C}\cdot\mathbf{s})$$


where s is the temporary Stokes vector of length equal to the temporary 
intensity i of the beam. The intensity transmitted by an analyser C 
is obtained by averaging as: 


$$I_c = \tfrac{1}{2}\,(I + \mathbf{C}\cdot\mathbf{S}) \tag{3}$$

and is hence determined by I and S. This expression will also be of use 
later. 


For unpolarised light we should expect [from our definition of the 
Stokes vector in (1)] that the Stokes vector should become a zero vector 
coinciding with the centre of the sphere. From (3) we see that this is 
necessarily so, since unpolarised light has the property that any elliptic 
analyser C always transmits half the intensity of the beam. (See also 
Hurwitz, 1945.) 
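
Relation (3) is easily tried out numerically. The following is a minimal sketch, with hypothetical function names, that evaluates the intensity passed by an analyser C for a completely polarised beam and for an unpolarised one.

```python
import math


def unit(lon, lat):
    """Unit vector to the point (lon, lat), in radians, on the Poincare sphere."""
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def transmitted(intensity, stokes, analyser):
    """Intensity passed by an elliptic analyser (unit vector), equation (3)."""
    dot = sum(c * s for c, s in zip(analyser, stokes))
    return 0.5 * (intensity + dot)


# Completely polarised beam of intensity 2 in the state P = (0, 0).
P = unit(0.0, 0.0)
S_polarised = tuple(2.0 * p for p in P)

theta = 1.0                      # angular distance of the analyser from P
C = unit(theta, 0.0)             # analyser state on the equator
C_opposite = tuple(-p for p in P)

t_general = transmitted(2.0, S_polarised, C)            # 2 cos^2(theta / 2)
t_opposite = transmitted(2.0, S_polarised, C_opposite)  # extinction
t_unpolarised = transmitted(2.0, (0.0, 0.0, 0.0), C)    # half the intensity

print(t_general, t_opposite, t_unpolarised)
```

The three cases reproduce, respectively, the cos^2(θ/2) law for a polarised beam, complete extinction by the opposite analyser, and the transmission of exactly half the intensity when S is the zero vector.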


One of the most important properties of the Stokes parameters is the 
following. When a number of incoherent beams are mixed, the Stokes vector 
S of the resultant beam is the vectorial sum of the individual Stokes vectors S_j 
of the separate beams (the intensity I of the resultant beam being naturally 
the sum of the individual intensities I_j). This readily follows from the fact 
that the total intensity I_c transmitted by any analyser should be the sum of 
the transmitted portions of the separate beams, this being the experimental 
test of their incoherence. Or, 


$$I_c = \tfrac{1}{2}\Big(\sum_j I_j + \mathbf{C}\cdot\sum_j \mathbf{S}_j\Big)$$


Since I_c is also given by (3) the required result is obtained. We may also 
express the result by the statement that when a number of incoherent beams 
are combined, each Stokes parameter of the resultant beam is the sum of the 
corresponding Stokes parameters of the individual beams. 


A particular consequence of the above result is that any partially 
polarised beam (I, S) may be looked upon as an incoherent combination of 
a completely polarised beam (S, S) and an unpolarised beam (I − S, 0). This 
gives a second physical interpretation of the Stokes vector S of a partially 
polarised beam: it is a vector whose length is equal to the intensity of the 
polarised portion of the beam, and whose orientation (i.e., point of intersection 
with the Poincaré sphere) gives the state of polarisation of this polarised 
portion. 
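
The additivity of the Stokes parameters and the decomposition just described may be illustrated by a short sketch. The function names and the two sample beams are assumptions made here for the purpose of illustration.

```python
import math


def add_incoherent(beams):
    """beams: list of (I, (Sx, Sy, Sz)). On incoherent mixing each Stokes
    parameter of the resultant is the sum of the corresponding parameters."""
    total_i = sum(b[0] for b in beams)
    total_s = tuple(sum(b[1][k] for b in beams) for k in range(3))
    return total_i, total_s


def split_polarised_unpolarised(intensity, stokes):
    """Split (I, S) into a completely polarised beam (S, S) and an
    unpolarised beam (I - S, 0), S denoting the length of the vector."""
    s_len = math.sqrt(sum(x * x for x in stokes))
    polarised = (s_len, stokes)
    unpolarised = (intensity - s_len, (0.0, 0.0, 0.0))
    return polarised, unpolarised


# Two completely polarised beams of unit intensity in different states.
beam_a = (1.0, (1.0, 0.0, 0.0))
beam_b = (1.0, (0.0, 1.0, 0.0))

I_mix, S_mix = add_incoherent([beam_a, beam_b])
polarised, unpolarised = split_polarised_unpolarised(I_mix, S_mix)
I_back, S_back = add_incoherent([polarised, unpolarised])

print(I_mix, S_mix)
print(polarised[0], unpolarised[0])
```

Here the mixture has intensity 2 but a Stokes vector of length √2 only, so that a portion 2 − √2 of the intensity behaves as unpolarised light; recombining the two portions restores the original (I, S).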


We shall not require any further properties of the Stokes representation 
than have been derived above. 


§ 3. THE DEGREE OF COHERENCE AND THE EFFECTIVE PHASE 
DIFFERENCE BETWEEN TWO POLARISED BEAMS 


We have already noted that in any beam which (for practical purposes) 
is completely polarised and monochromatic, the form of the elliptic vibration 
remains constant in time but the temporary intensity fluctuates. In order to 
explain the fact that it is possible for two polarised beams to be completely 
incoherent with one another, we must also add that the absolute phase of 
the elliptic vibration though remaining sensibly constant over successive 
durations very long compared with the period of light, also fluctuates very 
rapidly from a macroscopic standpoint (Stokes, 1852). Thus, any two beams 
of polarisation A and B travelling along the same direction may be 
characterised not only by temporary intensities i_1 and i_2, but also by the temporary 
phase advance δ_t which the vibration in one beam A has over that in the 
other. The phase difference between two ideal elliptic vibrations not in the 
same state of polarisation has already been defined in Part I, §§ 3 and 7. (The 
two vibrations are said to have zero phase difference when the state of 
polarisation C obtained by their composition is represented by a point which 
lies on the great circular arc joining the points representing the states A and B 
on the Poincaré sphere; for oppositely polarised vibrations a special arc AYB 
is chosen as the great circular arc of zero phase.) 


We shall find that the observable characteristics (I, S) of the beam 
obtained on compounding the two beams depend on the following average 
quantities correlating the fluctuations in the two beams. These may be called 
the effective phase advance δ of one beam A over the other, and their mutual 
degree of coherence γ (defined to be a positive quantity). These are defined, 
in the same manner as is done in ordinary diffraction theory (Zernike, 1938), 
where a scalar wave theory of light is used, by the relation: 


$$\langle \sqrt{i_1 i_2}\; e^{i\delta_t} \rangle = \sqrt{I_1 I_2}\;\gamma\, e^{i\delta} \tag{4}$$

where the sharp brackets are used to indicate the average value. 


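The above relation (in which i in the exponent represents √−1) is equivalent 
to the two relations 


$$2\sqrt{I_1 I_2}\;\gamma\cos\delta = 2\,\langle \sqrt{i_1 i_2}\,\cos\delta_t \rangle = U', \text{ say,} \tag{5}$$

$$2\sqrt{I_1 I_2}\;\gamma\sin\delta = 2\,\langle \sqrt{i_1 i_2}\,\sin\delta_t \rangle = V', \text{ say,} \tag{6}$$

so that 

$$\gamma = \frac{\sqrt{U'^2 + V'^2}}{2\sqrt{I_1 I_2}}; \qquad \tan\delta = \frac{V'}{U'} \tag{7}$$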

From (4) we see that δ has the properties of a phase difference: if the 
instantaneous phase difference is altered by a constant amount, the effective 
phase difference alters by the same value; while, if the instantaneous 
intensities are respectively multiplied by constant factors, the effective phase 
difference is unaltered. 


Regarding the degree of coherence γ (defined to be a positive quantity) 
it may be shown that it lies between the limits zero and unity. The proof is 
given in Linfoot (1955); it may also be obtained by representing the 
momentary ‘mutual intensity’ √(i_1 i_2) exp(iδ_t) as a vector in an Argand diagram, 
applying the argument used in proving (2b), and using the inequality 
√(I_1 I_2) ≥ ⟨√(i_1 i_2)⟩. The beams will be said to be coherent when γ = 1, 
which occurs if there is complete correlation between the fluctuations in the 
two beams, the temporary phase difference as well as the ratio of the 
temporary intensities being absolutely constant in time. On the other hand, when 
γ = 0 the beams will be said to be incoherent. 
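
The definition (4) can be made concrete by evaluating the average for a few artificial fluctuation patterns. The sketch below is illustrative only; the function name, the weighting scheme and the sample values are assumptions, the fluctuations being modelled as a finite list of states each occupied for a fraction w of the time.

```python
import cmath
import math


def coherence(samples):
    """samples: list of (w, i1, i2, delta_t), the weights w summing to 1.
    Returns (gamma, delta) as defined by equation (4)."""
    big_i1 = sum(w * i1 for w, i1, _, _ in samples)
    big_i2 = sum(w * i2 for w, _, i2, _ in samples)
    mutual = sum(
        w * math.sqrt(i1 * i2) * cmath.exp(1j * d) for w, i1, i2, d in samples
    )
    gamma = abs(mutual) / math.sqrt(big_i1 * big_i2)
    delta = cmath.phase(mutual)  # not meaningful when gamma is zero
    return gamma, delta


# Constant phase difference and constant intensity ratio: complete coherence.
g_coh, d_coh = coherence([(0.5, 1.0, 2.0, 0.7), (0.5, 3.0, 6.0, 0.7)])

# Phase difference visiting four quadrature values equally: incoherence.
quarter = math.pi / 2
g_inc, _ = coherence([(0.25, 1.0, 1.0, k * quarter) for k in range(4)])

# Phase difference alternating between 0 and 90 degrees: partial coherence.
g_par, d_par = coherence([(0.5, 1.0, 1.0, 0.0), (0.5, 1.0, 1.0, quarter)])

print(g_coh, d_coh)
print(g_inc)
print(g_par, d_par)
```

The three patterns give γ = 1, γ = 0 and an intermediate value γ = 1/√2 (with δ = 45°), in accordance with the limits stated above.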


Instead of γ and δ, we shall sometimes find it more convenient to use 
the correlation parameters U′ and V′ defined in (5) and (6); or alternatively 
a single complex quantity I_12 which has been termed the mutual intensity 
(Zernike, 1938): 


$$I_{12} = \sqrt{I_1 I_2}\;\gamma\, e^{i\delta} = \tfrac{1}{2}\,(U' + iV')$$


§ 4. INTERFERENCE OF THE COMPONENTS OF TWO PARTIALLY 
COHERENT BEAMS TRANSMITTED BY AN ANALYSER 


It is possible to determine experimentally the degree of coherence γ 
and the effective phase difference δ between two polarised beams of intensities 
I_1 and I_2, in states of polarisation A and B respectively. This can be done 
by observing the interference effects after resolving them to the same state of 
vibration by the use of an elliptic analyser which transmits light of 
polarisation C. Representing the states of polarisation A, B and C by corresponding 
points on the Poincaré sphere (see Part I, Fig. 1), the instantaneous intensity 
i_c transmitted by an analyser C can be expressed in terms of the sides a, b, c 
of the spherical triangle ABC, and its area E. According to the results of 
§ 8, VII of Part I of this paper, the instantaneous intensities of the resolved 
components transmitted by the analyser will be i_1 cos^2(b/2) and i_2 cos^2(a/2) 
respectively, their instantaneous phase difference being (δ_t − E/2). 

Hence 


$$i_c = i_1\cos^2\tfrac{1}{2}b + i_2\cos^2\tfrac{1}{2}a + 2\sqrt{i_1 i_2}\,\cos(\delta_t - \tfrac{1}{2}E)\,\cos\tfrac{1}{2}a\,\cos\tfrac{1}{2}b$$


(It is to be remembered that this expression holds also in the limiting case 
when A and B represent orthogonal states of polarisation; see Part I, § 8.) 
The intensity I_c transmitted by the analyser C is obtained by taking the 
average of the above expression using (4). 


$$I_c = I_1\cos^2\tfrac{1}{2}b + I_2\cos^2\tfrac{1}{2}a + 2\gamma\sqrt{I_1 I_2}\,\cos\tfrac{1}{2}a\,\cos\tfrac{1}{2}b\,\cos(\delta - \tfrac{1}{2}E) \tag{8}$$

This expression will be of much use later. 


If we denote the intensities of the resolved components of the two beams 
by I_1′ and I_2′ the above result may be written: 


$$I_c = I_1' + I_2' + 2\gamma\sqrt{I_1' I_2'}\,\cos(\delta - \tfrac{1}{2}E) \tag{8'}$$

The interference effects will be most pronounced when the intensities of the 
resolved components (transmitted by the analyser) are equal in magnitude, 
this being secured by using any analyser C for which I_1 cos^2(b/2) = I_2 cos^2(a/2). 
Under these conditions it can be easily shown that the visibility of fringes 
(as defined by Michelson) is equal to the degree of coherence γ. To determine 
δ experimentally we choose an analyser C for which the area E is zero, 
so that the point C lies on the great circular arc of zero phase joining A and B 
(see Part I, §§ 4 and 7). The effective phase advance δ of one beam over the 
other can then be measured by the amount by which its path must be retarded 
in order that the intensity resulting from the interference of the resolved 
components becomes a maximum. 
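
The statement regarding the visibility may be verified numerically from (8′). The sketch below, whose function names are hypothetical, sweeps the phase through a full cycle (as would be done by retarding one beam) and forms Michelson's visibility (max − min)/(max + min) from the resulting intensities.

```python
import math


def transmitted_intensity(i1_resolved, i2_resolved, gamma, delta, area=0.0):
    """Equation (8'): intensity passed by the analyser, in terms of the
    resolved component intensities, gamma, delta and the triangle area E."""
    cross = 2.0 * gamma * math.sqrt(i1_resolved * i2_resolved)
    return i1_resolved + i2_resolved + cross * math.cos(delta - 0.5 * area)


def visibility(i1_resolved, i2_resolved, gamma, steps=3600):
    """Michelson visibility obtained by sweeping delta over one full cycle."""
    values = [
        transmitted_intensity(
            i1_resolved, i2_resolved, gamma, 2.0 * math.pi * k / steps
        )
        for k in range(steps)
    ]
    return (max(values) - min(values)) / (max(values) + min(values))


v_equal = visibility(1.0, 1.0, 0.6)    # equal resolved intensities
v_unequal = visibility(1.0, 4.0, 0.6)  # unequal resolved intensities

print(v_equal, v_unequal)
```

With equal resolved intensities the visibility comes out equal to γ; with unequal intensities it is reduced in the ratio 2√(I_1′I_2′)/(I_1′ + I_2′), which is why the analyser is chosen so as to equalise the two components.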


From the expression (8) for the intensity transmitted by an analyser C 
when two partially coherent polarised beams are incident on it, we may 
deduce the following theorem. 


Suppose a number of independent streams of intensities a_1, a_2, ..., a_n, 
all in the state of polarisation A, are combined with a number of independent 
streams of intensities b_1, b_2, ..., b_n, all in the state of polarisation B. Let γ_j 
and δ_j denote the degree of coherence and the effective phase relation between 
the corresponding pairs of beams a_j and b_j. Then the degree of coherence γ 
and the effective phase advance δ of the resultant beam of polarisation A 
(of intensity I_1, the sum of the a_j) over that of polarisation B (of intensity 
I_2, the sum of the b_j) will be given by 


$$\sqrt{I_1 I_2}\;\gamma\, e^{i\delta} = \sum_j \sqrt{a_j b_j}\;\gamma_j\, e^{i\delta_j} \tag{9}$$


The above result follows from the fact that the intensity I_c transmitted by any 
analyser C, given by (8), will also be the sum of the intensities (I_c)_j, where (I_c)_j 
denotes the intensity transmitted due to the pair of beams a_j and b_j, this 
result being true for any value of E. The result may also be expressed by 
the statement that the mutual intensity between the resultant beams is the sum 
of the mutual intensities of the individual pairs. 


As a particular case of the above theorem we note that if an independent 
fraction f_1 of the intensity of one beam is completely coherent with an 
independent fraction f_2 of the intensity of the second, having a phase advance δ 
over it, the remaining portions of the two beams being incoherent, then δ 
is also the effective phase advance of the first beam over the second, while 
√(f_1 f_2) is their mutual degree of coherence. From the above result we obtain 
a still simpler method of regarding any two partially coherent beams: an 
independent fraction γ^2 of the intensity of one beam may be regarded as 
coherent with the whole of the second beam, having a phase advance δ over it. 
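
The theorem (9) and the particular case just mentioned may be tried out with the following sketch. The function name and the numerical values are assumptions chosen for illustration; each pair of streams is described by its two intensities together with its own degree of coherence and phase.

```python
import cmath
import math


def combine_pairs(pairs):
    """pairs: list of (a_j, b_j, gamma_j, delta_j).
    Adds the mutual intensities of the pairs, as in equation (9), and
    returns (gamma, delta) for the resultant beams."""
    big_i1 = sum(a for a, _, _, _ in pairs)
    big_i2 = sum(b for _, b, _, _ in pairs)
    mutual = sum(
        math.sqrt(a * b) * g * cmath.exp(1j * d) for a, b, g, d in pairs
    )
    return abs(mutual) / math.sqrt(big_i1 * big_i2), cmath.phase(mutual)


# Beam 1 has intensity 2 and beam 2 has intensity 3. A fraction f1 of the
# first is completely coherent with a fraction f2 of the second (phase
# advance 0.4); the remaining portions are incoherent.
I1, I2 = 2.0, 3.0
f1, f2 = 0.5, 0.8
pairs = [
    (f1 * I1, f2 * I2, 1.0, 0.4),                   # the coherent portions
    ((1.0 - f1) * I1, (1.0 - f2) * I2, 0.0, 0.0),   # the incoherent remainder
]
gamma, delta = combine_pairs(pairs)

# Equivalent picture: a fraction gamma^2 of beam 1 coherent with all of beam 2.
gamma_alt, delta_alt = combine_pairs([
    (gamma ** 2 * I1, I2, 1.0, delta),
    ((1.0 - gamma ** 2) * I1, 0.0, 0.0, 0.0),
])

print(gamma, delta)
print(gamma_alt, delta_alt)
```

Both pictures lead to the same γ = √(f_1 f_2) and the same δ, which illustrates that it is the mutual intensity, and not any particular way of dividing the beams into fractions, that is the invariant quantity.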

§ 5. ADDITION OF TWO PARTIALLY COHERENT BEAMS 
OF OPPOSITE POLARISATION 


Let i_1 and i_2 be the temporary intensities of two vibrations in the 
opposite states of polarisation X and X′, and let δ_t be the temporary phase advance 
of the first vibration over the second (Part I, § 7). Let s be the immediate 
value of the Stokes vector of the resultant vibration obtained, whose state 
of polarisation is represented by the point P (see Fig. 1). 





FIG. 1. 
P: Momentary state of polarisation of partially polarised beam. 
s: Momentary Stokes vector of length equal to the temporary intensity. 
δ_t: Instantaneous phase difference between the X and X′ components. 


We shall refer the temporary Stokes vector s to a right-handed system 
of co-ordinate axes OX, OY, OZ chosen such that the point Y represents the 
vibration whose X and X′ components are defined to be in the same phase 
(Part I, § 6). A fundamental property of the Poincaré sphere (Part I, § 2, I) 
is that when a vibration of intensity s in the state of polarisation P is resolved 
into two oppositely polarised vibrations in the states of polarisation X and X′, 
the intensities of these components (which are equal to $i_1$ and $i_2$) are given by: 

$$ i_1 = s\cos^2 \tfrac{1}{2}PX ; \qquad i_2 = s\sin^2 \tfrac{1}{2}PX $$

so that 

$$ i_1 - i_2 = s\cos PX ; \qquad 2\sqrt{i_1 i_2} = s\sin PX. $$

But s cos PX is the projection of the temporary Stokes vector in the OX 
direction; and s sin PX is the projection of the temporary Stokes vector on 
the YZ plane. Now according to Part I, § 6, VI, the angle PXY is equal to $\delta_t$. Hence 
the temporary intensity i and the temporary Stokes vector of the resultant 
vibration are given by: 

$$ i = i_1 + i_2 ; \quad s_x = i_1 - i_2 ; \quad s_y = 2\sqrt{i_1 i_2}\cos\delta_t ; \quad s_z = 2\sqrt{i_1 i_2}\sin\delta_t \tag{10} $$

We now take the average of the above expressions using (1), (5) and (6). 
We thus see that on compounding two oppositely polarised beams of intensities 
$I_1$ and $I_2$ (for which the mutual degree of coherence and the effective phase 
advance of the first beam over the second are $\gamma$ and $\delta$ respectively), the resultant 
partially polarised beam (I, S) is given by 

$$ I = I_1 + I_2 ; \quad S_x = I_1 - I_2 ; \quad S_y = 2\gamma\sqrt{I_1 I_2}\cos\delta ; \quad S_z = 2\gamma\sqrt{I_1 I_2}\sin\delta \tag{11} $$

It may be noted that $S_y$ and $S_z$ are equal to the correlation parameters 
U′ and V′ respectively. 


We shall now consider the converse problem, viz., of resolving a given 
beam (I, S) into two oppositely polarised beams X and X′, a decomposition 
which occurs when the beam falls on any transparent crystalline plate. 
Obviously when the instantaneous vibration s of this beam is resolved into 
vibrations in the states of polarisation X and X′, these component vibrations will 
have intensities $i_1$ and $i_2$, and a phase relation $\delta_t$ which satisfy (10). Since 
the momentary vibration s fluctuates for an incompletely polarised beam, 
it is clear that $i_1$, $i_2$ and $\delta_t$ will also fluctuate, so that the component beams 
will, in general, be partially coherent. The intensities $I_1$ and $I_2$ of the 
component beams, their degree of coherence $\gamma$ and the effective phase advance $\delta$ 
of the first beam over the second can be obtained from (11): 

$$ I_1 = \tfrac{1}{2}(I + S_x) ; \quad I_2 = \tfrac{1}{2}(I - S_x) ; \quad \gamma = \frac{\sqrt{S_y^2 + S_z^2}}{2\sqrt{I_1 I_2}} ; \quad \tan\delta = \frac{S_z}{S_y} \tag{12} $$

It may be noted that the expression for the effective phase difference $\delta$ 
between the component beams does not involve the degree of polarisation 
of the given beam. 

The first Stokes parameter of a beam is its intensity. We may now easily 
see that our method of defining the remaining Stokes parameters (as 
components of the average Stokes vector S with respect to a special co-ordinate 
system) is entirely equivalent to the usual method. To show this we 
consider the resolution of an arbitrarily polarised beam into two orthogonal 
linearly polarised beams. In this case the states X and X′ in Fig. 1 lie on 
the equator. The four Stokes parameters of the beam (with respect to axes 
on the wave-front given by X and X′) are then customarily defined as the 
average values of the expressions on the right-hand side of each of the four 
relations in (10). In our presentation (see § 2), the average values assumed 
by the quantities $i$, $s_x$, $s_y$, $s_z$, in the particular case when the XY plane of the 
co-ordinate system lies on the equatorial plane, are the Stokes parameters of 
the beam (with reference to axes on the wave-front given by X and X′). The 
relations (10) show that both methods of definition are equivalent. Our 
method of introducing the Stokes parameters (given in § 2) is more general, 
in that it does not at all involve the representation of an arbitrarily polarised 
beam as the sum of two other (partially coherent) linearly polarised beams. 
In fact such a decomposition is clearly seen to be merely a particular case 
of the problem which we shall discuss in § 7, viz., the decomposition of any 
partially polarised beam into two beams in non-orthogonal states of 
polarisation. 
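
The composition rule (11) and its converse (12) for oppositely polarised components can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, the Stokes vector is held as a plain tuple $(S_x, S_y, S_z)$ in the special axes of Fig. 1, and both component intensities are assumed to be non-zero.

```python
import math

def compose_opposite(I1, I2, gamma, delta):
    """Relations (11): resultant (I, S) of two oppositely polarised beams."""
    root = math.sqrt(I1 * I2)
    I = I1 + I2
    S = (I1 - I2,
         2.0 * gamma * root * math.cos(delta),
         2.0 * gamma * root * math.sin(delta))
    return I, S

def resolve_opposite(I, S):
    """Relations (12): component beams recovered from a given (I, S)."""
    Sx, Sy, Sz = S
    I1 = 0.5 * (I + Sx)
    I2 = 0.5 * (I - Sx)
    # Assumption: I1 > 0 and I2 > 0, otherwise gamma is undefined.
    gamma = math.hypot(Sy, Sz) / (2.0 * math.sqrt(I1 * I2))
    delta = math.atan2(Sz, Sy)  # quadrant-aware form of tan(delta) = Sz / Sy
    return I1, I2, gamma, delta

# Round trip: compose two beams, then resolve the resultant again.
I, S = compose_opposite(3.0, 1.0, 0.6, 2.0)
I1, I2, gamma, delta = resolve_opposite(I, S)
```

Using `atan2` instead of a bare arctangent keeps the quadrant of $\delta$, which the ratio $S_z/S_y$ alone does not fix.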


§ 6. ADDITION OF TWO PARTIALLY COHERENT BEAMS 
IN NON-ORTHOGONAL STATES OF POLARISATION 


When any two beams travelling along the same direction are combined, 
the instantaneous intensity of the resultant beam is obtained from Part I, 
§ 3, III as 

$$ i = i_1 + i_2 + 2\sqrt{i_1 i_2}\,\cos\tfrac{1}{2}c\,\cos\delta_t $$

where the similarity factor $\cos^2\tfrac{1}{2}c$ will be absolutely constant in time if the 
two beams are completely polarised. Averaging the above equation using 
(5) we obtain the generalized formula for the interference of two polarised 
beams of intensities $I_1$ and $I_2$, degree of coherence $\gamma$ and effective phase 
difference $\delta$: 

$$ I = I_1 + I_2 + 2\gamma\sqrt{I_1 I_2}\,\cos\tfrac{1}{2}c\,\cos\delta \tag{13} $$

Thus $\gamma$ and $\delta$ may be determined by direct interference experiments, though 
the method of using an analyser given in § 4 is to be preferred, since it increases 
the visibility of the interference effects. In relation (13), c is the angular 
separation between the states of polarisation on the Poincaré sphere. The above 
relation can also be easily obtained by regarding a fraction $\gamma^2$ of the first 
beam as coherent with the second beam, having a phase advance $\delta$ over it. 


It remains to find the Stokes vector of the resultant beam. Let the 
states of polarisation of the two beams be given by the points A and B on 
the Poincaré sphere, or alternatively by unit vectors A and B joining the 
centre of the sphere to the points in question. The Stokes vectors of the 
two beams are then $\mathbf{S}_1 = I_1\mathbf{A}$ and $\mathbf{S}_2 = I_2\mathbf{B}$. When the beams are not 
completely coherent it is clear that there will be fluctuations in the temporary 
state of polarisation and intensity of the resultant vibration, so that the 
resultant beam will be incompletely polarised. We shall in this section 
prove that the Stokes vector S of the resultant beam may be obtained by the 
following procedure (see Fig. 2). It is obtained by adding to the sum of the 
given Stokes vectors $\mathbf{S}_1$ and $\mathbf{S}_2$ (directed towards points A and B), a third 
vector $\mathbf{S}_{12}$ (directed towards a point C″). This last vector, which arises 
because of the interference of the beams, may be specified in terms of the angles 
of the triangle ABC″, which is isosceles: the base angles A and B are both 
equal to the effective phase difference $\delta$ between the beams, and the length of 
the vector $\mathbf{S}_{12}$ is $2\gamma\sqrt{I_1 I_2}\,\sin\tfrac{1}{2}C''$. 

The components of the Stokes vector of the resultant beam may be 
found by using the following proposition which follows from (3): the 
component of the Stokes vector along any direction C is equal to the intensity 
transmitted by an analyser C, minus the intensity transmitted by the orthogonal 
analyser (−C), viz., 

$$ I_c - I_{-c} = \mathbf{S}\cdot\mathbf{C} \tag{14} $$


Since we have already (in eq. 8) derived the intensity transmitted by any 
analyser when it is introduced in the path of two partially coherent polarised 
beams, we may find the Stokes vector S of the resultant beam. (It may be 
remembered that in equation 8, the quantities a, b, c are the sides of the 
spherical triangle ABC, while E is the area of the triangle, measured with 
the usual sign convention.) The intensity $I_{-c}$ transmitted by the orthogonal 
analyser (−C) may be obtained from (8) by changing a and b to their 
supplements, and E to E′, where E′ is the area of the triangle ABC′ colunar to 
ABC. We then have 

$$ I_c - I_{-c} = I_1\cos b + I_2\cos a + 2\gamma\sqrt{I_1 I_2}\,\{\cos\tfrac{1}{2}a\cos\tfrac{1}{2}b\cos(\delta - \tfrac{1}{2}E) - \sin\tfrac{1}{2}a\sin\tfrac{1}{2}b\cos(\delta - \tfrac{1}{2}E')\} $$

Since the two beams which are being combined are completely polarised, 
we have according to (2 a), $I_1 = S_1$ and $I_2 = S_2$, where $S_1$ and $S_2$ are the lengths 
of the Stokes vectors $\mathbf{S}_1$ and $\mathbf{S}_2$ of these beams. The first two terms of the 
above expression are accordingly equal to $\mathbf{S}_1\cdot\mathbf{C}$ and $\mathbf{S}_2\cdot\mathbf{C}$ respectively. 
Comparing with (14) we see that the Stokes vector S of the resultant beam may be written as 

$$ \mathbf{S} = \mathbf{S}_1 + \mathbf{S}_2 + \mathbf{S}_{12} \tag{15} $$

where the term $\mathbf{S}_{12}$, which may be considered as arising from the mutual 
interference of the beams, may be determined from the relation 

$$ \mathbf{S}_{12}\cdot\mathbf{C} = 2\gamma\sqrt{I_1 I_2}\,\{\cos\tfrac{1}{2}a\cos\tfrac{1}{2}b\cos(\delta - \tfrac{1}{2}E) - \sin\tfrac{1}{2}a\sin\tfrac{1}{2}b\cos(\delta - \tfrac{1}{2}E')\} \tag{16} $$

It may be noted that $\mathbf{S}_{12}$ is $\gamma$ times the value which it would have if the beams 
were completely coherent. 


Since the last relation gives the component of $\mathbf{S}_{12}$ along any direction 
C, we may determine the vector $\mathbf{S}_{12}$ by finding its components with respect 
to the special co-ordinate system OX, OY, OZ given in Fig. 2. The positive 
x-axis is taken along the direction of the vector (A − B); the positive y-axis 
along the direction of the vector (A + B); and the positive z-axis along the 
direction of the vector (A × B). (The definitions of the y and z directions 
would have to be slightly modified if we wish also to cover the limiting case, 
discussed in the previous section, when A and B are oppositely polarised.) 





FIG. 2. 
Composition of non-orthogonally polarised beams $\mathbf{S}_1$ and $\mathbf{S}_2$. The vector for the resultant 
partially polarised beam is the sum of the three vectors drawn in the Figure; $\mathbf{S}_{12}$ has an orientation 
determined by A = B = $\delta$, and a length equal to $2\gamma\sqrt{I_1 I_2}\,\sin\tfrac{1}{2}C''$. 


When the unit vector C is taken along the x direction (see Fig. 2) we 
will have in (16), $\cos\tfrac{1}{2}a\cos\tfrac{1}{2}b = \sin\tfrac{1}{2}a\sin\tfrac{1}{2}b$ and E = E′ = 0. When the 
unit vector C coincides with the y-axis we have E = 0, E′ = 2π and $a = b = \tfrac{1}{2}c$. 
Lastly when the direction of C coincides with OZ we will have a = b = π/2 
and E = −E′ = c. Making these substitutions in (16) we obtain the 
components of $\mathbf{S}_{12}$ as 

$$ (S_{12})_x = 0 ; \quad (S_{12})_y = 2\gamma\sqrt{I_1 I_2}\cos\delta ; \quad (S_{12})_z = 2\gamma\sqrt{I_1 I_2}\sin\delta\,\sin\tfrac{1}{2}c \tag{17} $$

We see that the vector $\mathbf{S}_{12}$ lies in the yz plane, and hence that the triangle 
ABC″ is isosceles. Also the inclination of this vector to the y-axis is equal 
to the arc YC″ and is given by 

$$ \tan YC'' = (S_{12})_z / (S_{12})_y = \tan\delta\,\sin\tfrac{1}{2}c $$

Since the spherical triangle AYC″ is right angled at Y, it is seen from 
spherical trigonometry that the above relation implies that the angle at A is 
equal to $\delta$. This locates the position of C″ or the orientation of the vector 
$\mathbf{S}_{12}$. Its length is given by 

$$ S_{12} = \sqrt{(S_{12})_y^2 + (S_{12})_z^2} = 2\gamma\sqrt{I_1 I_2}\,\sqrt{1 - \sin^2\delta\cos^2\tfrac{1}{2}c} = 2\gamma\sqrt{I_1 I_2}\,\sin\tfrac{1}{2}C'' $$

These are the relations for determining the vector $\mathbf{S}_{12}$ which were stated at 
the beginning of this section, and which we wished to prove. The 
superposition of oppositely polarised beams, discussed in the previous section, 
may be considered as a limiting case of the present discussion. 


The result may also be expressed concisely in vector notation, the result 
being then independent of any co-ordinate system. Using the correlation 
parameters U′ and V′ introduced in (5) and (6) instead of $\gamma$ and $\delta$, we can 
then write (17) as 

$$ \mathbf{S}_{12} = \tfrac{1}{2}\sec\tfrac{1}{2}c\,\{U'(\mathbf{A} + \mathbf{B}) + V'(\mathbf{A}\times\mathbf{B})\} \tag{18} $$

or 

$$ \mathbf{S}_{12} = \text{Real part of } I_{12}\,\mathbf{S}_{12}'\,\sec\tfrac{1}{2}c \tag{19} $$

where 

$$ \mathbf{S}_{12}' = (\mathbf{A} + \mathbf{B}) - i(\mathbf{A}\times\mathbf{B}) $$

Hence when two non-orthogonally polarised beams with Stokes vectors 
$\mathbf{S}_1$ and $\mathbf{S}_2$ are combined, the Stokes vector S of the resultant beam is given 
by (15) where $\mathbf{S}_{12}$ is given by (18), or (19). The intensity I of the resultant 
beam is given by (13). 

The components of the Stokes vector of the resultant beam (with respect 
to the special co-ordinate axes chosen) may now be written down from (15) 
and (17) by noting that the x-components of $\mathbf{S}_1$ and $\mathbf{S}_2$ will be $I_1\sin\tfrac{1}{2}c$ and 
$-I_2\sin\tfrac{1}{2}c$ respectively, while the y-components will be $I_1\cos\tfrac{1}{2}c$ and 
$I_2\cos\tfrac{1}{2}c$ respectively. Hence the Stokes vector of the resultant beam is 
given by: 

$$ S_x = (I_1 - I_2)\sin\tfrac{1}{2}c ; \quad S_y = (I_1 + I_2)\cos\tfrac{1}{2}c + 2\gamma\sqrt{I_1 I_2}\cos\delta ; \quad S_z = 2\gamma\sqrt{I_1 I_2}\sin\delta\,\sin\tfrac{1}{2}c \tag{20} $$
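
Relations (13), (15) and (18) translate directly into a small routine working on arbitrary unit vectors A and B. The sketch below is illustrative only: the helper names are hypothetical, vectors are plain 3-tuples, and the correlation parameters are taken as $U' = 2\gamma\sqrt{I_1 I_2}\cos\delta$ and $V' = 2\gamma\sqrt{I_1 I_2}\sin\delta$, as relation (11) implies. The antipodal case c = π is excluded because $\sec\tfrac{1}{2}c$ diverges there.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def compose(I1, A, I2, B, gamma, delta):
    """Resultant intensity (13) and Stokes vector (15), with S12 from (18).

    A and B are unit vectors on the Poincare sphere and must not be antipodal.
    """
    half_c = 0.5 * math.acos(max(-1.0, min(1.0, dot(A, B))))
    root = math.sqrt(I1 * I2)
    U = 2.0 * gamma * root * math.cos(delta)  # correlation parameter U'
    V = 2.0 * gamma * root * math.sin(delta)  # correlation parameter V'
    I = I1 + I2 + U * math.cos(half_c)        # relation (13)
    k = 0.5 / math.cos(half_c)                # (1/2) sec(c/2)
    AxB = cross(A, B)
    S12 = tuple(k * (U * (a + b) + V * n) for a, b, n in zip(A, B, AxB))
    S = tuple(I1 * a + I2 * b + s for a, b, s in zip(A, B, S12))
    return I, S
```

Two quick consistency checks follow from the text: with $\gamma = 1$ the resultant must be completely polarised, so the length of S equals I, and with A and B placed symmetrically about the y-axis the components reduce to (20).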


§ 7. DECOMPOSITION OF ANY BEAM INTO TWO NON-ORTHOGONALLY 
POLARISED PENCILS 


We now consider a problem which is the converse of that treated in the 
previous section, viz., the resolution of a given arbitrarily polarised beam 
(I, S) into two non-orthogonally polarised beams in given states of 
polarisation A and B. Such a process occurs for example when the beam falls 
on a plate of an absorbing biaxial crystal. 


If s denotes the instantaneous Stokes vector of the given beam, then 
according to the results proved in Part I, § 4, IV, the temporary intensities 
$i_1$ and $i_2$ of the component beams will be given by 

$$ i_1 = s\,\frac{\sin^2\tfrac{1}{2}\theta_2}{\sin^2\tfrac{1}{2}c} ; \qquad i_2 = s\,\frac{\sin^2\tfrac{1}{2}\theta_1}{\sin^2\tfrac{1}{2}c} $$

where $\theta_1$ and $\theta_2$ are the (momentary) angles that the vector s makes with 
the vectors A and B respectively. 


Now, since i = s, this may be re-written thus: 

$$ i_1 = \tfrac{1}{2}(i - \mathbf{s}\cdot\mathbf{B})\,\mathrm{cosec}^2\tfrac{1}{2}c ; \qquad i_2 = \tfrac{1}{2}(i - \mathbf{s}\cdot\mathbf{A})\,\mathrm{cosec}^2\tfrac{1}{2}c $$

The average intensities of the non-orthogonally polarised component beams 
will therefore be 

$$ I_1 = \tfrac{1}{2}(I - \mathbf{S}\cdot\mathbf{B})\,\mathrm{cosec}^2\tfrac{1}{2}c ; \qquad I_2 = \tfrac{1}{2}(I - \mathbf{S}\cdot\mathbf{A})\,\mathrm{cosec}^2\tfrac{1}{2}c \tag{21} $$


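It remains to determine the degree of coherence and the effective phase 
difference between the component beams, or alternatively the correlation 
parameters U′ and V′ defined in (5) and (6). The first parameter is obtained 
by eliminating $(I_1 + I_2)$ from the expression for $S_y$ in (20) by using (13). We 
thus obtain 

$$ U' = (S_y - I\cos\tfrac{1}{2}c)\,\mathrm{cosec}^2\tfrac{1}{2}c \tag{22-a} $$

$$ \phantom{U'} = \tfrac{1}{2}\{\mathbf{S}\cdot(\mathbf{A} + \mathbf{B}) - 2I\cos^2\tfrac{1}{2}c\}\,\mathrm{cosec}^2\tfrac{1}{2}c\,\sec\tfrac{1}{2}c \tag{22-b} $$

The parameter V′ is given by the last relation in (17): 

$$ V' = S_z\,\mathrm{cosec}\,\tfrac{1}{2}c \tag{23-a} $$

$$ \phantom{V'} = \tfrac{1}{2}\,\mathbf{S}\cdot(\mathbf{A}\times\mathbf{B})\,\mathrm{cosec}^2\tfrac{1}{2}c\,\sec\tfrac{1}{2}c \tag{23-b} $$

or, combining the two, 

$$ 2I_{12} = \tfrac{1}{2}\{\mathbf{S}\cdot\mathbf{S}_{12}'^{\,*} - 2I\cos^2\tfrac{1}{2}c\}\,\mathrm{cosec}^2\tfrac{1}{2}c\,\sec\tfrac{1}{2}c $$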
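
The converse problem can be sketched in the same way from (21), (22-b) and (23-b). This is an illustrative routine rather than the author's procedure: the names are hypothetical, the beam is supplied as an intensity I and a 3-tuple S, and $\gamma$ and $\delta$ are recovered on the assumption that $U' = 2\gamma\sqrt{I_1 I_2}\cos\delta$ and $V' = 2\gamma\sqrt{I_1 I_2}\sin\delta$. The states A and B must be neither coincident nor antipodal.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def resolve(I, S, A, B):
    """Beams in states A and B equivalent to the beam (I, S).

    Returns (I1, I2, gamma, delta) from relations (21), (22-b) and (23-b).
    """
    half_c = 0.5 * math.acos(max(-1.0, min(1.0, dot(A, B))))
    sin2 = math.sin(half_c) ** 2
    cos1 = math.cos(half_c)
    I1 = 0.5 * (I - dot(S, B)) / sin2                                   # (21)
    I2 = 0.5 * (I - dot(S, A)) / sin2
    A_plus_B = tuple(a + b for a, b in zip(A, B))
    U = 0.5 * (dot(S, A_plus_B) - 2.0 * I * cos1 ** 2) / (sin2 * cos1)  # (22-b)
    V = 0.5 * dot(S, cross(A, B)) / (sin2 * cos1)                       # (23-b)
    gamma = math.hypot(U, V) / (2.0 * math.sqrt(I1 * I2))
    delta = math.atan2(V, U)
    return I1, I2, gamma, delta
```

Feeding this routine a Stokes vector built from (13) and (20) should return the intensities, degree of coherence and phase difference that were used to build it.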


As a particular example of much interest we may consider the 
decomposition of unpolarised light into two non-orthogonally polarised beams. 
Since for unpolarised light S = 0, the intensities $I_1$ and $I_2$ of the component 
beams are, according to (21), both equal to $\tfrac{1}{2}I\,\mathrm{cosec}^2\tfrac{1}{2}c$. The effective 
phase difference and the degree of coherence could be determined by finding 
the parameters U′ and V′ from (22) and (23). But it is more instructive 
to go back to relation (15) from which it may be noted that for S to be a zero 
vector, $\mathbf{S}_{12}$ must be coplanar with $\mathbf{S}_1$ and $\mathbf{S}_2$. Since $\mathbf{S}_{12}$ must also be in the 
yz plane, it is clear that the point C″ towards which it points must be the 
mid-point of the greater segment of the great circular arc through A and B. 
The length of the vector $\mathbf{S}_{12}$, being equal to that of $(\mathbf{S}_1 + \mathbf{S}_2)$, will be given by 
$(I_1 + I_2)\cos\tfrac{1}{2}c$. The effective phase difference between the component 
beams is π (being equal to the angle A of the isosceles triangle ABC″). The 
degree of coherence between the two beams is $\cos\tfrac{1}{2}c$ (since the length of 
the vector $\mathbf{S}_{12}$ is also given by $2\gamma\sqrt{I_1 I_2}\,\sin \pi/2$). 


FIG. 3. 
Decomposition of unpolarised light into two non-orthogonally polarised beams with 
vibration-directions along A and B. The resultant beams have a degree of coherence $\cos\tfrac{1}{2}c$ and an 
effective phase difference of π. 


The decomposition of unpolarised light into two non-orthogonal 
vibrations in the states of polarisation A and B (separated by an angle c 
on the Poincaré sphere) may be more simply analysed by replacing the 
unpolarised light by two incoherent beams each of intensity $\tfrac{1}{2}I$ in the 
orthogonally polarised states A and A′. (In order that the general method may be 
perfectly clear, the particular case when A and B are linear vibrations 
inclined at the angle $\tfrac{1}{2}c$ is drawn in Fig. 3 and may be followed simultaneously.) 
The beam of polarisation A′ may now be replaced by two coherent beams 
in the non-orthogonal states of polarisation B and A. These latter two 
vibrations will have a phase difference of π, since A′ lies on the greater 
segment of the great circle through A and B (Part I, § 4, V). Their intensities 
will be respectively $\tfrac{1}{2}I\,\mathrm{cosec}^2\tfrac{1}{2}c$ and $\tfrac{1}{2}I\cot^2\tfrac{1}{2}c$ (as is given by the 
parallelogram law in the case of Fig. 3, and in the general case by substituting b = π 
and a = π − c in the results of Part I, § 4, IV). Thus in the state of 
polarisation B we have a beam of intensity $\tfrac{1}{2}I\,\mathrm{cosec}^2\tfrac{1}{2}c$; while in the state of 
polarisation A we have two incoherent vibrations which add to give a beam 
of the same intensity $\tfrac{1}{2}I\,\mathrm{cosec}^2\tfrac{1}{2}c$. Of the latter beam, however, an 
independent fraction comprising an intensity $\tfrac{1}{2}I\cot^2\tfrac{1}{2}c$ is completely coherent 
with the other beam and is opposed in phase to it. In other words, the 
degree of coherence between the beams is $\cos\tfrac{1}{2}c$ and the effective phase 
difference is π (according to the result proved at the end of § 4). 


§ 8. THE ADDITION OF n PARTIALLY COHERENT BEAMS 


We shall now consider the addition of n polarised beams whose states of 
polarisation are represented by the points $P_1, P_2, \ldots, P_n$ on the Poincaré 
sphere, or alternatively by the unit vectors $\mathbf{P}_1, \mathbf{P}_2, \ldots, \mathbf{P}_n$ joining the centre 
of the sphere to these points. The instantaneous intensity i of the resultant 
beam will be given by equation (14) of Part I, § 9: 

$$ i = \sum_j i_j + \sum_{j \neq k} i_{jk}\cos\tfrac{1}{2}c_{jk} $$

where $i_j$ denotes the temporary intensity of the jth beam, $i_{jk}$ the temporary 
mutual intensity of the jth beam with respect to the kth and $c_{jk}$ is the angle 
between the vectors $\mathbf{P}_j$ and $\mathbf{P}_k$. Averaging the above equation we obtain 
the following expression for the intensity I of the resultant beam: 

$$ I = \sum_j I_j + \sum_{j \neq k} I_{jk}\cos\tfrac{1}{2}c_{jk} \tag{24} $$

where $I_{jk}$ is the mutual intensity of the jth beam with respect to the kth. 
The second term in (24) arises from the mutual interference of the different 
pairs of beams. 


The form of (24) suggests that the Stokes vector S of the resultant beam 
may be obtained by a similar generalisation of relation (15) obtained for two 
beams: 

$$ \mathbf{S} = \sum_j \mathbf{S}_j + \tfrac{1}{2}\sum_{j \neq k}\mathbf{S}_{jk} \tag{25} $$

or alternatively 

$$ \mathbf{S} = \sum_j \mathbf{S}_j + \tfrac{1}{2}\sum_{j \neq k} I_{jk}\,\mathbf{S}_{jk}'\,\sec\tfrac{1}{2}c_{jk}, \quad\text{where}\quad \mathbf{S}_{jk}' = (\mathbf{P}_j + \mathbf{P}_k) - i(\mathbf{P}_j\times\mathbf{P}_k) \tag{26} $$


That the relations (25) and (26) do indeed give the Stokes vector of the 
resultant beam may be verified by taking recourse to the intensity transmitted 
by any analyser C when the n beams are incident on it. This intensity $I_c$ 
will be the average of the instantaneous transmitted intensity [which may be 
obtained by using equation (13) of Part I, § 9]: 

$$ I_c = \sum_j I_j\cos^2\theta_j + \sum_{j \neq k}\gamma_{jk}\sqrt{I_j I_k}\,\cos\theta_j\cos\theta_k\cos(\delta_{jk} - \tfrac{1}{2}E_{jk}) $$

where $2\theta_j$ denotes the angle between $\mathbf{P}_j$ and C, and $E_{jk}$ the area of the spherical 
triangle $CP_jP_k$. The component of the Stokes vector of the resultant beam 
along any direction C is obtained by writing the value of $(I_c - I_{-c})$. Since 
the $\mathbf{S}_{jk}$ satisfy relations of the type of (16) it may be easily shown that (26) 
gives the Stokes vector of the resultant beam. 
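
Relations (24) and (26) can be sketched as a double loop over ordered pairs of beams. The routine below is an illustration under stated assumptions, not the paper's own algorithm: the names are hypothetical, and the mutual intensities are supplied as a Hermitian matrix with $I_{jk} = \gamma_{jk}\sqrt{I_j I_k}\,e^{i\delta_{jk}}$, which is what comparison of (13) with (24) and of (18) with (19) suggests for two beams. No pair of states may be antipodal.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add_beams(I, P, M):
    """Relations (24) and (26) for n polarised beams.

    I[j]    : intensity of beam j
    P[j]    : unit vector of its state on the Poincare sphere
    M[j][k] : complex mutual intensity I_jk, with M[k][j] = conj(M[j][k])
    """
    n = len(I)
    total = float(sum(I))
    S = [sum(I[j] * P[j][a] for j in range(n)) for a in range(3)]
    for j in range(n):
        for k in range(n):
            if j == k:
                continue
            c_jk = math.acos(max(-1.0, min(1.0, dot(P[j], P[k]))))
            cos_half = math.cos(0.5 * c_jk)
            total += (M[j][k] * cos_half).real               # term of (24)
            PxP = cross(P[j], P[k])
            for a in range(3):
                s_prime = (P[j][a] + P[k][a]) - 1j * PxP[a]  # S'_jk of (26)
                S[a] += 0.5 * (M[j][k] * s_prime).real / cos_half
    return total, tuple(S)
```

For n = 2 the result should coincide with (13) and (20), and when every off-diagonal entry of M vanishes the Stokes vectors simply add.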


It is a pleasure to acknowledge the encouragement given by Prof. Sir C. V. 
Raman, F.R.S., N.L., and the keen interest he took in this investigation. 


§9. SUMMARY 


The superposition of two partially coherent but completely polarised 
beams is discussed. The formula for the intensity of the resultant beam is 
obtained from the interference formula for coherent beams by multiplying 
the third interference term by the degree of coherence $\gamma$ (defined statistically). 
The states of the two given polarised beams and that of the resultant 
incompletely polarised beam may be characterised by respective vectors drawn 
from the centre of the Poincaré sphere: the length of each vector and its 
orientation (i.e., point of intersection with the sphere) may be regarded as 
giving respectively the intensity and state of polarisation, of the polarised 
fraction of the corresponding beam. The vector for the resultant beam is 
obtained by adding to the sum of the two given vectors (which are directed 
towards points A and B), a third vector directed towards a point C″ on the 
Poincaré sphere. This last vector, which arises because of the interference 
of the beams, is specified in terms of the angles of the triangle ABC″, which 
is isosceles: the base angles A and B are both equal to the effective phase 
difference $\delta$ and the length of the vector is equal to $2\gamma\sqrt{I_1 I_2}\,\sin\tfrac{1}{2}C''$. 


The converse problem is discussed and also the addition of n partially 
coherent polarised beams. The paper also presents the subject of the Stokes 
parameters of partially polarised radiation through an extension of the 
Poincaré representation. 


SYNTHESIS OF CITREOROSEIN AND ALOE-EMODIN 


By T. R. RAJAGOPALAN AND T. R. SESHADRI, F.A.Sc. 
(Department of Chemistry, Delhi University, Delhi) 
Received October 15, 1956 


IN continuation of the earlier work on the synthesis of teloschistin,¹ the 
synthesis of citreorosein has now been accomplished. The starting substance 
is frangula-emodin (I), the synthesis of which has already been recorded by 
Eder and Widmer² and also by Jacobson and Adams.³ Its triacetate (III) 
undergoes smooth reaction with N-bromosuccinimide yielding the ω-bromo 
derivative (IV) which on treatment with silver acetate and acetic anhydride 
forms the tetra-acetate of citreorosein (V). Hydrolysis with methanolic 
sulphuric acid produces a good yield of citreorosein (VI) which agrees in every 
respect with the natural samples obtained (i) from Penicillium cyclopium 
Westling by Anslow, Breen and Raistrick⁴ and (ii) from Penicillium citreoroseum 
Dierckx by Posternak and Jacob;⁵ the mixed melting points are 
undepressed. 

In the course of this study, it has been observed that emodin (I) can be 
conveniently methylated using dimethyl sulphate and potassium carbonate 
to yield physcion (VIII) and the reverse conversion from physcion to emodin 
can be brought about by boiling with hydrobromic acid for a long period. 

As a piece of exploratory work a simpler case has been examined. 
Starting from chrysophanol (II), aloe-emodin (VII) has been prepared. The 
method using N-bromosuccinimide works quite satisfactorily and is far more 
convenient than the earlier method of synthesis using rhein 
and effecting reduction to the carbinol in two stages.⁶ 


EXPERIMENTAL PROCEDURE 


Synthesis of citreorosein 


(i) Bromination of emodin triacetate to the ω-bromo derivative. Emodin 
triacetate was prepared by suspending emodin (0.9 g.) in acetic anhydride 
(17 c.c.) containing concentrated sulphuric acid (2 drops) and refluxing the 
mixture for a few minutes. The mixture was allowed to cool to room 
temperature and after two hours it was poured on crushed ice and the yellow 
solid that separated was filtered, washed with water and dried. It was crystallised 
from ethyl acetate yielding long yellow needles melting at 196-98° C. Yield, 
0.85 g. 


To a solution of emodin triacetate (0.5 g.) in dry carbon tetrachloride 
(150 c.c.) were added freshly crystallised and dried N-bromosuccinimide 
(0.4 g.) and benzoyl peroxide (0.02 g.) and the mixture refluxed on a 
water-bath for 24 hours. The solution was then filtered hot to remove the separated 
succinimide and the filtrate cooled in ice, when a pale yellow crystalline solid 
separated. It was filtered and washed with cold water and then with boiling 
water to remove succinimide and unreacted bromosuccinimide. The solid 
was dried in a vacuum desiccator and crystallised thrice from dry ethyl acetate 
yielding ω-bromo-emodin triacetate as pale yellow long needles, melting at 
232-34° C. Yield, 0.35 g. (Found: C, 53.2; H, 3.4; Br, 16.8%. 
C₂₁H₁₅O₈Br requires C, 53.1; H, 3.2; Br, 16.9%). 

(ii) Conversion of the bromo derivative into citreorosein tetra-acetate. 
The above bromo compound (0.2 g.) was suspended in acetic anhydride 
(15 c.c.) and silver acetate (0.6 g.) added and the mixture refluxed for 6 hours. 
It was then poured on crushed ice and stirred. The brownish yellow solid 
was filtered, washed well with water and dried. The solid was repeatedly 
boiled with benzene (5 × 25 c.c.) and filtered to leave behind the silver 
bromide. The filtrate on evaporation yielded a pale yellow solid, which was 
crystallised from dry benzene to yield pale yellow needles of the tetra-acetate 
of citreorosein melting at 190-91°. Yield, 0.15 g. (Anslow et al., 190-91°). 


(iii) Hydrolysis of the acetate to citreorosein. The acetate (0.1 g.) was 
suspended in methanol (50 c.c.), concentrated sulphuric acid (2 c.c.) added 
carefully and the mixture refluxed on a water-bath for an hour. Methanol 
was then removed under reduced pressure and the solution poured on crushed 
ice. The gelatinous precipitate was coagulated by boiling for an hour, and 
the orange solid was filtered. It was crystallised from methanol (norite) to 
yield dull orange needles of citreorosein, melting at 288-89°. Yield, 0.07 g. 
(Anslow et al., 288°). Mixed melting points with the natural samples of 
citreorosein kindly supplied by Professor H. Raistrick and Professor T. Posternak 
were undepressed. (Found: C, 62.6; H, 3.8%. C₁₅H₁₀O₆ requires C, 62.9; 
H, 3.5%). 


It was insoluble in cold 2% aqueous sodium bicarbonate but dissolved in 
aqueous sodium carbonate and sodium hydroxide giving red solution in each 
case. It gave a reddish orange colour with concentrated sulphuric acid and 
reddish brown colour with alcoholic ferric chloride. 
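
The "requires" figures in the analyses above can be reproduced by simple arithmetic on the molecular formulae. The sketch below is an editorial illustration: it assumes the formulae C₂₁H₁₅O₈Br for ω-bromo-emodin triacetate and C₁₅H₁₀O₆ for citreorosein, and it uses present-day rounded atomic weights, so small differences from the 1956 figures are to be expected.

```python
# Rounded modern atomic weights (an assumption; not the 1956 table).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999, "Br": 79.904}

def percent_composition(formula):
    """Mass percentage of each element for a formula given as {element: count}."""
    total = sum(ATOMIC_WEIGHT[el] * n for el, n in formula.items())
    return {el: 100.0 * ATOMIC_WEIGHT[el] * n / total for el, n in formula.items()}

# Assumed formulae for the two analysed compounds.
bromo_triacetate = percent_composition({"C": 21, "H": 15, "O": 8, "Br": 1})
citreorosein = percent_composition({"C": 15, "H": 10, "O": 6})
```

Rounded to one decimal place, the carbon and hydrogen values agree with the quoted figures, while bromine comes out about 0.1% lower, consistent with the older atomic weights in use at the time.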


Demethylation of Physcion 


To a suspension of physcion (1 g.) in glacial acetic acid (150 c.c.) was 
added hydrobromic acid (d. 1.8; 120 c.c.) and the mixture was refluxed for 
12 hours, at the end of which the solution became clear. Glacial acetic acid 
was distilled off under reduced pressure and the residue poured on crushed 
ice. The separated brown solid was filtered and dissolved in aqueous sodium 
carbonate (5%, 200 c.c.). The carbonate solution was filtered from suspended 
impurities and acidified with ice-cold dilute hydrochloric acid. The 
precipitate was filtered and on crystallisation from dilute alcohol, it yielded emodin 
as orange red needles, melting at 253-55°. Yield, 0.9 g. Mixed melting point 
with an authentic sample of emodin was undepressed. 


Partial methylation of emodin to physcion 


To a solution of emodin (1 g.) in dry acetone (100 c.c.) were added 
dimethyl sulphate (0.4 c.c.; 1.1 moles) and dry potassium carbonate (3 g.). 
The mixture, which was deep red in colour, was refluxed for 4 hours and filtered, 
and the potassium salts were repeatedly washed with hot acetone. The solvent 
was distilled off from the filtrate and the residue was dissolved in chloroform. 
The chloroform solution was extracted with aqueous sodium carbonate (5%; 
50 c.c.) to remove the unreacted emodin (0.3 g.). After washing the 
chloroform layer with water, it was dried over anhydrous sodium sulphate and then 
distilled, when an orange-red solid was left behind, which crystallised from 
glacial acetic acid as orange-yellow rectangular plates melting at 206-07°. 
Yield, 0.5 g. It was insoluble in aqueous sodium carbonate (5%) but dissolved 
easily in sodium hydroxide and potassium hydroxide producing purple red 
solutions from which pink precipitates soon separated. With alcoholic ferric 
chloride, it gave a reddish brown colour. Mixed melting point with 
an authentic sample of physcion was undepressed. 


Synthesis of aloe-emodin 


Chrysophanol diacetate (60 mg.), carbon tetrachloride (15 c.c.), 
N-bromosuccinimide (50 mg.) and benzoyl peroxide (20 mg.) were employed and the 
reaction carried out as described in the case of emodin triacetate. The crude 
bromo compound, melting at 212-15° with earlier sintering, was directly 
used for further stages. It was refluxed with silver acetate (50 mg.) and acetic 
anhydride (3 c.c.) for 6 hours. On working up and crystallising the product 
from benzene, aloe-emodin triacetate was obtained as yellow needles melting 
at 175-77°. Hydrolysis with methanolic sulphuric acid yielded aloe-emodin 
as pale brown orange needles, m.p. 221-23°, undepressed by admixture with 
the natural sample of aloe-emodin, kindly supplied by Prof. Shibata (m.p. 
222-24°). 

SUMMARY 


Starting from chrysophanol and frangula-emodin the corresponding 
ω-hydroxy compounds, aloe-emodin and citreorosein, have been prepared 
adopting the N-bromosuccinimide method. Partial methylation of 
frangula-emodin to physcion is effected most conveniently by the methyl 
sulphate-potassium carbonate method. Physcion can be demethylated directly to 
frangula-emodin by long boiling with hydrobromic acid. 


1. THE PHYSICAL BACKGROUND 


RAMAKRISHNAN AND MATHEWS (1953, 1955) have recently treated the energy 
loss of fast charged particles passing through matter in one dimension, as 
a stochastic process described by an integro-differential equation. They set 
out to determine the probability $\pi(E|E_0; t)\,dE$ that the (kinetic) energy E 
of a particle, which hits an absorber with the initial energy $E_0$, will be found 
in the interval between E and E + dE when the particle reaches a depth t. 
The interaction between the particle and the absorber determines a function 
R(E′|E) dE′ which gives the probability that the particle of energy 
E will lose the energy E − E′ while traversing a unit thickness of the 
absorber. Since the particle does not gain energy on its passage through 
matter, R(E′|E) can be different from zero only when E > E′. (It would 
seem that in an absorber with variable density this function should depend 
also on the depth t. However, it is possible in that case to redefine t so that 
it does not enter R(E′|E), since the fundamental meaning of the variable t 
is the number of atoms which the particle passes by.) 


In the present work we shall assume that the energy transferred from the 
particle to the absorber by collisions is very small as compared with its energy 
loss due to Bremsstrahlung. R(E′|E) dE′ dt then gives the probability that 
the particle will emit radiation of total energy E − E′ while traversing a layer 
of thickness dt. This assumption is justified for a light particle (we shall 
restrict our discussion to electrons) of very high energy in an absorber made 
of heavy atoms (Heitler, 1954). Thus for an electron in a lead absorber, the 
subsequent work will be quite useless at electron energies E < mc², at which 
collisions are dominant; it will gradually improve as the kinetic energy 
increases beyond the rest energy mc²; and it will become practically exact at 
very high energies, from the order of 10 mc² upwards. Following general 
usage we describe this range of validity by introducing a critical energy $E_c$ 
such that for energies higher than $E_c$ the effect of collisions may be neglected. 





* Now at Trinity College, Dublin. 

A little reflection on the meaning of the probabilities $\pi(E; t)$ and 
R(E′|E) for an ensemble of impinging electrons shows that $\pi(E; t)$ must 
be a solution of the equation 

$$ \frac{\partial\pi(E; t)}{\partial t} = -\pi(E; t)\int_0^{E} R(E'|E)\,dE' + \int_E^{\infty}\pi(E'; t)\,R(E|E')\,dE'. \tag{1} $$

This equation is one of the cases considered by Ramakrishnan and Mathews 
(loc. cit.) and it also follows from the well-known set of equations for a 
cascade shower (Janossy, 1948) when collisions, pair-creation and annihilation 
are not involved. The solution of (1) will, in the present approximation, 
be physically meaningful only when $E > E_c$. Nevertheless, the integration in 
the first term on the right-hand side of (1) must be extended into the region 
$E' < E_c$, since an energy loss E − E′ such that $E > E_c > E'$ does 
contribute to $\partial\pi/\partial t$ in the relevant region $E > E_c$. 
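
The structure of equation (1), a loss term out of each energy and a gain term from all higher energies, can be illustrated by a forward-Euler step on a uniform energy grid. This is an editorial sketch under stated assumptions and not the method developed in the paper: the function name, the grid and the kernel array are hypothetical, the upper limit of the gain integral is truncated at the top of the grid, and any kernel used with it is purely illustrative rather than the Bremsstrahlung probability.

```python
def euler_step(pi, R, dE, dt):
    """One forward-Euler step of equation (1) on a uniform energy grid.

    pi[i]   approximates pi(E_i; t)
    R[j][i] approximates R(E_j | E_i), used only for j < i (energy is only lost)
    """
    n = len(pi)
    new = []
    for i in range(n):
        # Loss: probability leaving E_i for any lower energy E_j.
        loss = pi[i] * sum(R[j][i] for j in range(i)) * dE
        # Gain: probability arriving at E_i from any higher energy E_k.
        gain = sum(pi[k] * R[i][k] for k in range(i + 1, n)) * dE
        new.append(pi[i] + dt * (gain - loss))
    return new
```

On such a grid every loss from one cell reappears as a gain in a lower cell, so the total probability (the sum of `pi[i] * dE`) is conserved by each step, while the mean energy can only decrease.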

In solving equation (1) Ramakrishnan and Mathews we. cit.) have 
employed two restrictions. 


(i) The initial energy of the impinging electron is precisely known or, 
in terms of an ensemble, the impinging beam is monoenergetic; the energy 
distribution at the absorber boundary is a delta-function. 


(ii) The emission probability is such that R(E'|E) dE' can be written as
R(q) dq, where q = E'/E. It is this condition that permits a successful application
of the Mellin transform. We observe, however, that the emission
probability for Bremsstrahlung (Heitler, loc. cit.) acquires this mathematical
form only after a further approximation has been introduced (see e.g., Janossy,
loc. cit.) which, for heavy absorber atoms, affects electrons with energies
appreciably above the critical energy E_c. On the other hand, it is only
when the absorber is made of heavy atoms that the collision loss of energy
can be neglected in a wide energy region (i.e., that E_c is small).


We propose to solve (1) without postulating either of the restrictions 
(i), (ii). Thus we have a probability distribution of initial electron energies,
π(E; 0), prescribed at the absorber boundary (t = 0). Let E_0 be the upper
bound of the electron energy (which may be extended to infinity) so that the
physically relevant range to be considered is E_c < E < E_0. E_0 will play an
important role subsequently in the transformation of variables. We shall
require merely that both the probabilities R(E'|E) and π(E; 0) be continuous
in the domain E_0 > E > E' > 0. The imposition of (ii) on R(E'|E) decreases
the accuracy of the corresponding solution of (1) for electrons with energies
not very much higher than E_c, passing through an absorber made of heavy
atoms; for such electrons our solution should therefore be more accurate
than that arrived at by using the Mellin transform.


2. MATHEMATICAL TREATMENT 


On the basis of the preceding discussion we shall seek a solution of (1)
in the domain E_0 > E > E' > 0, with an arbitrarily prescribed boundary
condition, π(E; 0) at t = 0.


We shall first consider the asymptotic solution π_0(E; t) for high energies
(E ~ E_0). In the high-energy tail of the distribution the second term on
the right-hand side of (1) may be neglected on account of its narrow integration
range, the integrand being continuous; we are left with

$$
\frac{\partial \pi_0(E; t)}{\partial t} = -\pi_0(E; t) \int_0^{E} R(E'|E)\, dE'.
$$

The solution of this equation is

$$
\pi_0(E; t) = \pi(E; 0) \exp\left[-t \int_0^{E} R(E'|E)\, dE'\right]. \tag{2}
$$
Led by the form of (2), we now write the rigorous solution in the form

$$
\pi(E; t) = \psi(E; t) \exp\left[-t \int_0^{E} R(E'|E)\, dE'\right], \qquad \psi(E; 0) = \pi(E; 0). \tag{3}
$$

Substitution of (3) into (1) yields, after some rearrangement,

$$
\frac{\partial \psi(E; t)}{\partial t} = \int_{E}^{E_0} R(E|E')\, \alpha(E, E'; t)\, \psi(E'; t)\, dE', \tag{4}
$$

$$
\alpha(E, E'; t) = \exp\left[-t \left( \int_0^{E'} R(E''|E')\, dE'' - \int_0^{E} R(E''|E)\, dE'' \right)\right].
$$

We now integrate with respect to t, taking into account the boundary condition
at t = 0:

$$
\psi(E; t) = \pi(E; 0) + \int_0^{t} dt' \int_{E}^{E_0} R(E|E')\, \alpha(E, E'; t')\, \psi(E'; t')\, dE'. \tag{5}
$$

Our aim is to reduce (1) to a Volterra equation with a known solution. For
this purpose we introduce new variables Ē, Ē' by the transformation


$$
\bar{E} = E_0 - E, \qquad \bar{E}' = E_0 - E'.
$$

A little work with inequalities shows that the new variables are to be considered
in the triangular domain E_0 > Ē' > Ē > 0. The Jacobian of the
transformation is unity and (5) goes over into


¥(E; t) = 7(E; 0) 


-f- 


© Sm py 
Finally, we are free to introduce some large but finite upper bound t_0 to the
thickness t of the absorber, so that t_0 > t > t' > 0, and t_0 may be put numerically
equal to E_0 by a suitable choice of units. With these modifications (6)
indeed becomes, in the domain under consideration, a Volterra equation of
the second kind which has been treated in detail by Volterra (1913). Its solution
is given by an expansion of the Liouville-Neumann type,


FE; 1) =7(E; 0 —fdE'f SK E,1;E, 17:04 (7) 


Qo i= 


where the kernels K_i are given by
K; = R(EIE) a (E, Er’); 
Ki(E, t; E, 2’) 
E Wii a —- 
om — J des R(E|x) a (E, x, y) Kin (x,y; B, 1’) dy (8) 
E’ 


for i > 1. In order to compute the nth kernel it is not, however, necessary
to know all the n − 1 previous ones; for it can be shown (Volterra, loc. cit.)
that the useful relation


= — f t — — 
is valid for all 1 < j < i. It has also been shown by Volterra that continuity
of R(E|E') and of π(E; 0) is sufficient to ensure the convergence of the
expansion (7) in the domain under consideration.
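

The convergence of a Liouville-Neumann expansion can be illustrated on a one-variable analogue. The following is only a minimal sketch, not the kernels of (8): it solves u(x) = g(x) + ∫_0^x K(x, s) u(s) ds by repeated substitution on a grid, with the hypothetical choices K = 1 and g = 1, for which the exact solution is exp(x). The function name `neumann_volterra`, the trapezoidal quadrature, the grid size and the number of terms are assumptions made for the illustration.

```python
import math


def neumann_volterra(kernel, g, x_max, n_points=201, n_terms=25):
    """Approximate u(x) = g(x) + int_0^x kernel(x, s) u(s) ds on a uniform grid.

    The solution is built as the Liouville-Neumann series u = g + Kg + K(Kg) + ...,
    where K is the Volterra integral operator (trapezoidal rule, an assumption).
    """
    h = x_max / (n_points - 1)
    xs = [i * h for i in range(n_points)]
    g_vals = [g(x) for x in xs]

    def apply_operator(values):
        # (K v)(x_i) = int_0^{x_i} kernel(x_i, s) v(s) ds by the trapezoidal rule
        out = [0.0] * n_points
        for i in range(1, n_points):
            total = 0.5 * (kernel(xs[i], xs[0]) * values[0]
                           + kernel(xs[i], xs[i]) * values[i])
            for j in range(1, i):
                total += kernel(xs[i], xs[j]) * values[j]
            out[i] = h * total
        return out

    term = g_vals[:]        # current term of the series, starting with g
    solution = g_vals[:]    # running partial sum of the series
    for _ in range(n_terms):
        term = apply_operator(term)
        solution = [s + t for s, t in zip(solution, term)]
    return xs, solution


# Illustrative test case: K = 1, g = 1, exact solution exp(x).
xs, u = neumann_volterra(lambda x, s: 1.0, lambda x: 1.0, x_max=1.0)
max_error = max(abs(ui - math.exp(xi)) for xi, ui in zip(xs, u))
print("largest deviation from exp(x):", max_error)
```

Each pass of the loop adds one more iterated kernel acting on g, which mirrors the structure of a term-by-term expansion; the remaining discrepancy in this toy case comes from the quadrature rule rather than from truncating the series.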


In terms of the original variables the rigorous solution of (1) is given by

$$
\pi(E; t) = \bar{\psi}(E_0 - E; t) \exp\left[-t \int_0^{E} R(E'|E)\, dE'\right], \tag{10}
$$

where ψ̄(Ē; t) is given by (7), (8), (9). The solution (10) has physical significance
in the energy region E_c < E < E_0 and may be particularly useful for
energies close to E_c, as discussed in Section 1.


SUMMARY 


The integro-differential equation describing the energy distribution of 
very fast electrons (with negligible collision loss of energy) is reduced to a 
Volterra integral equation of the second kind. A rigorous solution, for an 
arbitrarily prescribed boundary condition, is given in terms of a convergent 
expansion of the Liouville-Neumann type. This method is applicable for 
transition probabilities of a form which is much more general than the form 
permitting a successful application of the Mellin transform. Since the reduc- 
tion of the emission probability for Bremsstrahlung to the latter form involves 
an approximation in a certain region, the present method may be used in order 
to obtain a higher accuracy in that region.


INTRODUCTION


THE theory of stochastic processes is the natural outcome of the attempts 
to extend the scope of probability theory to “dynamical” systems, the word
“dynamical” being used in its most general sense, i.e., to denote change with
respect to a parameter which may be many dimensional (time representing 
a particular case of a one-dimensional parameter). It is surprising that while 
dynamics and probability calculus are old and well established branches of 
mathematics, almost treated as classical by the modern mathematician, sto- 
chastic theory is only of recent origin. More recent are the attempts at a 
systematic application of the theory to physical problems. Two reasons can
be adduced to explain the delay in such attempts: (i) the inadequate acquaintance
of physicists with the abstract formulation of the fundamentals of
stochastic theory, and (ii) the inability to formulate in precise mathematical terms
the physical conditions imposed in any physical problem. The first of these
is slowly disappearing with the increasing attempts by physicists and
applied mathematicians to acquaint themselves with the mathematical formalism.
The task is made easier by the very recent appearance of a few much
needed text-books which explain in simple and clear terms the results embodied
in “high brow” mathematical formulations. Particular mention must be
made of Bartlett’s book, which is a very valuable introduction for those interested
in acquiring a “serviceable knowledge” of the theory to be applied to
specific physical problems. But the difficulties in the transformation of the 
physical nature of the problem into precise mathematical terms are more 
fundamental and cannot be removed easily without the determined effort 
on the part of the physicist to devise operational techniques as and when 
the need arises—a need which cannot otherwise be foreseen in a mere attempt 
to extend the scope of the theory by abstract logical reasoning. It is here
that the theoretical physicist steps in to play his role, not only by applying
mathematical methods for a more profitable study and better understanding
of physical phenomena, but also by supplying, incidentally, food for mathematical
thought in the course of his attempts to develop what may seem at first
non-rigorous, albeit useful, operational techniques to solve particular problems.


* This discussion is based on a one-hour address at the Annual Meeting of the Indian Academy
of Sciences, 1955, and on lectures delivered at the Massachusetts Institute of Technology in the
summer of 1956.


We shall therefore develop a purely physical theory of stochastic processes
which, it is hoped, will serve as an introduction for a new entrant to the
subject. A comprehensive article on ‘probability and stochastic processes’
is under preparation.†


Incidentally we shall refer to some preliminary difficulties that face us 
in the task of the precise mathematical formulation of the physical problems 
even at the risk of being told that the answers have been implied in the exist- 
ing mathematical literature. The purpose of the present discussion will be 
more than amply fulfilled if such an elucidation is forthcoming. Answers 
to some of the difficult questions are also put forward using phenomenological 
arguments in the hope that they can be cast into proper form at the hands of 
a pure mathematician. 


1. GENERAL PHENOMENOLOGICAL DESCRIPTION 


We shall now describe, from a purely phenomenological point of view,
the principal features of a stochastic process in comparison with a deterministic
dynamical system, varying with respect to a one-dimensional parameter
t. A deterministic dynamical system is assumed to be capable of existing
at any instant t in any one of an aggregate of states called, for obvious
reasons, occupation states, which may be finite, or enumerably infinite, or
continuous infinite in number. The state of the system at t is defined by specifying
the occupation state in which the system is at that time. If we assume
t to be a continuous parameter, we wish to study the variation of the state of
the system with t. Since we have postulated a deterministic system the state
of the system at t is a function of t. If an occupation state is defined by a
number, this number is a function of t and if we change t to t + dt, the number
must change by an infinitesimal quantity proportional to dt at almost all
t except at a finite or enumerably infinite number of points. If the set of states
were discrete and t continuous we shall show that either the system can change
only at a finite number of points in a finite interval, the system remaining static
between any two such changes, or we are forced to the conclusion that the
system changes an infinite number of times in an infinitesimal interval. The
first possibility is trivial while the second we exclude due to physical considerations.
Thus for a deterministic system, if we ignore the first possibility,
a discrete set of states is incompatible with continuous t. This can
be illustrated by a simple example. If we have only two possible occupation
states defined by two colours ‘blue’ and ‘red’, is it possible to conceive
of the colour as a function of t? For, if at t we have ‘blue’ then at t + dt,
if there should be a change, it should be ‘red’ and thus we are led to the
interesting conclusion that in a finite period of time which can be made as small
as we please, there will be infinite changes, not taking into account that we
get into insoluble conceptual difficulties in dividing dt into sub-intervals.
The only possibility therefore left for us is that any interval (0, t) is to be
divided into finite intervals and in each interval the state of the system continues
to be the same. This would mean that except at a finite or enumerable
number of points the system does not change.


† To be published in the Handbuch der Physik (Springer-Verlag) in 1957.


ALLADI RAMAKRISHNAN


The normal problem of dynamics stated in the most general terms is 
to study the variation of the dynamical system, in a finite interval if we know 
the mode of variation in an infinitesimal interval. This is expressible in the 
form of differential equations. As soon as we write a differential equation 
with differential coefficients referring to the variation of a quantity defining
the occupation states with t, we have conceded that the state at t + dt is
completely defined by the state at t and the mode of transition is defined by
the physical conditions of the problem. If we are given the state at t = 0
and we wish to obtain the state at t, we integrate the differential equation.


In a stochastic process, at t we cannot say whether the system is in any
particular occupation state. The system can occupy any one of the occupation
states with different probabilities and we therefore speak of the probability
that the system is in a state S at t. This is defined by the probability
frequency function at t. If an observation were made the system will be found in one
of the occupation states which we call the realised state at t. We wish to
know what happens to the system when we change t. The probability distribution
in the various occupation states of course changes with t and one
of the objects of the theory of stochastic processes is to predict the probability
distribution in the occupation states at t given the initial distribution
at t = 0. A more detailed study can be made by determining the joint probability
distribution of the system in the occupation states at a finite number
of points on the t-axis, the number being made as large as we please. It is
obvious that it is almost an intractable problem (except in some simple cases)
to obtain the joint distribution at all points in the interval (0, t) since they form
a non-enumerable set. However if we plot in any particular experiment the
realised state of the system at τ as τ varies from 0 to t we obtain the realised
curve of the trajectory of the stochastic process. It is difficult to ascribe a
measure to this trajectory except in some simple cases since we meet with the
pathological problem of the joint distribution at a continuous infinite
number of points.


Turning to the simpler task of determining the probability distribution 
at a particular ¢, by analogy with the dynamical system, we should know the 
variation of this distribution for an infinitesimal change of t. Here we meet 
with a difficulty which has no parallel in deterministic systems. The change 
in the probability frequency function (hereafter referred to as the p.f.f.) 
of the system as t varies from t to t + dt need not be completely specified by
the state at t. This complication may be explained in clear terms if we state
that at t = 0, the system has been observed to be in a particular state. So,
our object is to determine the p.f.f. at t. In many cases the specification of
initial conditions may not be sufficient to derive this p.f.f. as the development
of the process between 0 and t may depend upon the ‘history’ of the process
prior to t = 0, i.e., we may need information about the realised states of the
system prior to t = 0. In other words if a system was observed to be, say, in
the state S' at t = −a and S at t = 0, it may evolve in a different manner
from a system which was observed to be in S'' at t = −a and in S at t = 0.
If the development of a process from t = 0 is independent of its history prior
to t = 0, we call it Markovian. All other processes are non-Markovian and
it is clear that all types of non-Markovian processes cannot be enumerated
and they can only be conceived of as a residuary class. The distinction between
Markovian and non-Markovian processes does not arise in deterministic
systems since in those cases the occupation state in which the system lies
is completely determined by t.


Throughout this paper we shall confine ourselves to Markovian processes. 
This does not mean that non-Markovian processes are completely neglected. 
From the previous discussion, it is clear that the p.f.f. of the system refers 
to the occupational states. By redefining a proper set of occupation states 
a process which is non-Markovian when we consider the first set, may be 
viewed to be Markovian if we consider the system to exist in another suitably 
defined set of occupation states. This technique is well known and cannot 
be cast into any general form but is easily illustrated by taking particular 
examples (see Alladi Ramakrishnan, 1955). 


In a Markovian process, we are interested in determining π(S|S_0; t, t_0),
the probability that the system is in S at t given that it was in S_0 at t_0 (t_0 < t).
If in addition we assume the process to be homogeneous in t, π is a function
only of t − t_0 and we can set t_0 = 0 and write π(S|S_0; t) instead of
π(S|S_0; t, t_0). To determine π, our first task is to find out what happens
in an infinitesimal interval to a system which is known to exist at S_0 at t = 0.
This amounts to knowing the lim π(S|S_0; t) as t → Δ → 0. If we make an
observation at t = Δ, it will be found to be in any one of the many possible
states. This transition from S_0 to S occurs with probability lim π(S|S_0; t) as
t → Δ → 0. The knowledge of this limit gives us a complete picture of what
happens in the infinitesimal interval and it is therefore possible to express
π(S|S_0; t) for finite t in terms of this limit. This is expressed in the form of
a differential equation with respect to t. If the occupation states are discrete
in number summations may occur with respect to them, and if they are
continuous infinite in number and describable with the aid of variables in a
continuous domain, integrals with respect to the variables will occur. This
equation is known as the forward differential equation of the process.


Let us now consider the nature of the lim π(S'|S; t) as t → Δ → 0,
for two cases.


(i) When we have a discrete set of occupation states,
(ii) When the set of occupation states is continuous infinite.


(i) If S is one belonging to a discrete set, say S_1, S_2, S_3, ..., we have
ruled out by phenomenological arguments any deterministic change from any
S to any other state in the interval Δ. On the other hand, the system may
jump from S_i to S_j in the interval with probability proportional to Δ. This
condition of proportionality to Δ “safeguards” the process against infinite
changes in a finite interval which can be made as small as we please. If
an experiment were performed between 0 and t, and the states plotted, the
number of realised transitions will be finite and the probability of an infinite
number of transitions vanishes. Thus in a stochastic system “jumps” from
one state to another are possible in an infinitesimal interval with infinitesimal
probabilities.


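In the case when we have a discrete set of states, let us write for convenience
π(k|l; t) for π(S_k|S_l; t) and assume

$$
\begin{aligned}
\pi(k|l; t) &\to R(k|l)\,\Delta && \text{as } t \to \Delta \to 0 \text{ for } k \neq l, \\
\pi(k|l; t) &\to 1 - \Delta \sum_{j \neq l} R(j|l) && \text{as } t \to \Delta \to 0 \text{ for } k = l,
\end{aligned}
\tag{1}
$$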


i.e., the probability of change in an infinitesimal interval Δ is proportional
to Δ while the probability of “continuance” is 1 − O(Δ). π can now be
shown to satisfy the forward differential equation

$$
\frac{\partial \pi(k|l; t)}{\partial t} = -\pi(k|l; t) \sum_{j \neq k} R(j|k) + \sum_{j \neq k} \pi(j|l; t)\, R(k|j). \tag{2}
$$

Since π(j|i; t) is a probability magnitude, π(j|i; t) ≥ 0 for all t, j, i;
equally so R(j|i) ≥ 0 for all j, i (j ≠ i). If R(j|i) > 0, note that π(j|i; Δ)
is O(Δ) as t → Δ → 0, but it is finite and greater than zero for finite t. For
convenience we divide the above type of stochastic processes into two groups.


(1) When R(j|i) > 0 for all j, i, i.e., the system is ‘connected’ for infinitesimal
transformations. In this case, while π(j|i; t) is O(Δ) as t → Δ → 0,
it is finite and greater than zero for finite t.


(2) When R(j|i) = 0 for some j, i, say j⁺, i⁺. Even in this case, the author
has shown (Ramakrishnan, 1956) that π(j⁺|i⁺; t) > 0 provided that for every pair
of states j⁺, i⁺ for which R(j⁺|i⁺) = 0, there exists a sequence of states
S_{k_1}, S_{k_2}, ..., S_{k_n} such that each of the terms

$$
R(j^{+}|k_n),\ R(k_n|k_{n-1}),\ \ldots,\ R(k_2|k_1),\ R(k_1|i^{+})
$$

is greater than zero. In such a case we say the system is completely connected
for finite transformations and not for infinitesimal transformations.
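

The distinction between the two groups can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it integrates the discrete forward equation dπ(k|l; t)/dt = −π(k|l; t) Σ_j R(j|k) + Σ_j π(j|l; t) R(k|j) with a plain Euler step for a hypothetical three-state system in which R(2|0) = 0 while R(1|0) and R(2|1) are positive. The rate values, the step count and the function name `evolve` are invented for the example.

```python
def evolve(rates, start, t_end, steps=20000):
    """Euler integration of the forward equation for a discrete-state jump process.

    rates[k][j] is R(k|j), the transition rate from state j to state k (k != j).
    Returns the list pi[k] = pi(k | start; t_end).
    """
    n = len(rates)
    dt = t_end / steps
    pi = [0.0] * n
    pi[start] = 1.0
    # total exit rate from each state k: sum over j != k of R(j|k)
    exit_rate = [sum(rates[j][k] for j in range(n) if j != k) for k in range(n)]
    for _ in range(steps):
        new_pi = []
        for k in range(n):
            gain = sum(pi[j] * rates[k][j] for j in range(n) if j != k)
            loss = pi[k] * exit_rate[k]
            new_pi.append(pi[k] + dt * (gain - loss))
        pi = new_pi
    return pi


# Hypothetical rates: 0 -> 1 and 1 -> 2 are allowed, the direct jump 0 -> 2 is not.
R = [
    [0.0, 0.5, 0.3],   # R(0|0), R(0|1), R(0|2)
    [1.0, 0.0, 0.4],   # R(1|0), R(1|1), R(1|2)
    [0.0, 2.0, 0.0],   # R(2|0), R(2|1), R(2|2)
]

pi_t = evolve(R, start=0, t_end=1.0)
print("pi(k|0; t=1) =", [round(p, 4) for p in pi_t])
print("total probability =", round(sum(pi_t), 6))
```

In this toy system the probability of reaching state 2 from state 0 is positive at finite t even though the direct rate vanishes, while over a very short interval it is of second order in the interval length, which is the sense in which the system is connected for finite but not for infinitesimal transformations.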


(ii) If the occupation states form a continuous system describable by a
one-dimensional† variable, say E, then in a deterministic system E is a function
of t. Infinitesimal changes of t will cause infinitesimal changes in E, but in
the case of a stochastic system, the system in a state E_1 can jump to a state
between E_2 and E_2 + dE_2, E_2 − E_1 being finite, with probability proportional
to Δ, in an interval Δ. A stochastic system may be characterised not only
by these possible changes but also may involve deterministic changes. Thus
in defining the lim π(S'|S; t) as t → Δ → 0, we must first unravel the possible
stochastic transitions from the deterministic changes. This is necessary only
when the S states form a continuous system, now defined by the one-dimensional
variable E. Thus replacing the S's by the E's, we can safely state that
π(E|E_0; t) tends to a limit R(E|E_0) Δ as t → Δ → 0 provided |E_0 − E|
is finite since it will refer only to stochastic transitions. If no stochastic transition
occurs, then the change in E is determined by the mode of the deterministic
change characteristic of the process.


Consider the infinitesimal changes in E for an infinitesimal change Δ
in t. It is physically reasonable that this change is proportional to Δ,
since Δ is an arbitrary increment and what is true of Δ must be true of Δ/2,
Δ/3, Δ/4, etc. But the factor of proportionality can be either (i) a deterministic
function of E or t, or (ii) a stochastic variable with a defined frequency
function, or (iii) more generally, a random function of t. We shall for the
moment disregard the last two possibilities and consider only the first one.
Even in that case we shall assume that the factor of proportionality is not a
function of t, for as soon as we concede that it is a function of t we tacitly
assume that t has to be measured from a particular point and this amounts
to a knowledge of the prehistory of the process, disturbing the Markovian
property. For simplicity therefore we shall assume that the factor is a deterministic
function of the realised value of E at t which we shall denote by
f(E). f(E) need not be a continuous function of E, but should be bounded,
provided it is understood that whenever differential coefficients of π are involved
we introduce suitable delta functions. We can now write the stochastic
forward differential equation for the probability frequency function
as

$$
\frac{\partial \pi(E|E_0; t)}{\partial t} = -\pi(E|E_0; t) \int_{E'} R(E'|E)\, dE' + \int_{E'} \pi(E'|E_0; t)\, R(E|E')\, dE' - \frac{\partial}{\partial E}\left\{ f(E)\, \pi(E|E_0; t) \right\} \tag{3}
$$

where f(E) dt is the deterministic change in E, if no stochastic transitions
occur. Such a process we call a “blended” process. If f(E) = 0, the author
has called the process a “basic random process”.


† Actually there is no restriction on the number of dimensions. The extension to a many
dimensional variable presents no difficulty in the description of the process.


In the above discussion we take into consideration either a discrete set
of states S_1, S_2, ... or a continuous set defined by a single variable E. It
must be mentioned that the state can be defined with the aid of as many discrete
and continuous variables as are necessary for the description of a physical
system and in this case the stochastic differential equation will naturally be
more complicated than the one given above. But our point here is merely
to mention that the stochastic equations in the case of the continuous set may
consist of two sets of terms, one denoting stochastic and the other deterministic
changes. A large freedom is allowed to define the basic occupation
states and this usually depends upon the ingenuity in adapting the physical
theory to actual problems.


2. A PROCESS DEFINED BY THE FORWARD DIFFERENTIAL EQUATION


A typical trajectory of a process defined by (3) in the interval 0 to t will
be characterised by finite jumps at a finite number of points (due to the realisation
of stochastic transitions) and a continuous curve elsewhere. As mentioned
earlier, it is possible to have discrete deterministic changes at a finite
number of points on the t-axis but in this case since no changes occur between
these points it is of no interest for our discussion and we therefore exclude
this possibility. If in addition we do not postulate infinitesimal deterministic
changes for infinitesimal changes in t, i.e., f(E) = 0, the realised value of E
will be constant between two transitions. Such a process has been called a
basic random process. The measure of a typical trajectory of either a basic
random process or a simple blended process with transitions at t_1, t_2, ...
is just the probability that the transitions take place at those points. The
mathematical difficulties of assigning a measure to a typical trajectory of a
random function do not occur in the simplified case. Thus we can say that
provided we know f and R(E|E'), all the characteristic features of the process
can be studied.
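

A typical trajectory of this kind can be generated directly once f and R are given. The sketch below is an assumption-laden illustration, not a construction taken from the text: it takes a deterministic change f(E) = −aE, a constant total jump rate, and a jump law in which the new value is a uniform fraction of the old one. Under these assumptions the waiting time to the next stochastic transition is exponentially distributed, and between transitions E follows the deterministic curve. The function name `simulate_blended` and all parameter values are hypothetical.

```python
import math
import random


def simulate_blended(e0, t_end, jump_rate=2.0, drift=0.5, seed=0):
    """Generate one realised trajectory of a simple blended process on (0, t_end).

    Assumptions (illustrative only):
      * deterministic change f(E) = -drift * E between jumps,
      * jumps occur at the constant total rate `jump_rate`,
      * at a jump E is replaced by q * E with q uniform on [0, 1).
    Returns a list of (time, E_before_jump, E_after_jump) and the final E.
    """
    rng = random.Random(seed)
    t, energy = 0.0, e0
    jumps = []
    while True:
        wait = rng.expovariate(jump_rate)   # time to the next stochastic transition
        if t + wait >= t_end:
            # no further jump: only the deterministic change acts until t_end
            energy *= math.exp(-drift * (t_end - t))
            return jumps, energy
        t += wait
        before = energy * math.exp(-drift * wait)   # continuous piece of the curve
        after = rng.random() * before               # finite jump
        jumps.append((t, before, after))
        energy = after


jumps, final_e = simulate_blended(e0=100.0, t_end=5.0)
print(len(jumps), "jumps; final E =", final_e)
```

Setting `drift=0.0` gives the basic random process of the text, for which the realised value stays constant between two transitions.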


Till now we considered t to be continuous. t can also be assumed to
be discrete. If t is discrete and the set of occupation states continuous the
process can be only stochastic. For if it were assumed to be deterministic,
it is equivalent to stating that for each value of t there corresponds one occupation
state (if the state is a single valued function of t) or a finite number.
Thus corresponding to all the values of t there can be at most an enumerable
number since t is discrete. Thus the set of occupation states cannot be continuous.
Denoting the discrete set of t by t_1, t_2, ... and using the continuous
parameter E to represent an occupation state, we define
R(E'|E; t_{n+1}, t_n) dE' as the probability that the system is between E' and E' + dE'
at t_{n+1}, given that it was in E at t_n.


If the set of occupation states is discrete (say denoted by S_1, S_2, ...) it
is obvious that the process can be either deterministic or stochastic. If the
process is stochastic we define R(j|i; t_{n+1}, t_n) as the probability that the system
jumps to S_j at t_{n+1} given that it was in S_i at t_n.


Summarising our results we can classify processes from the point of view 
of the nature of occupation states and of the parameter t.


(A) Occupation states: discrete set, t discrete; a process can be either
deterministic or stochastic or ‘blended’.


(B) Occupation states: discrete, t continuous; process can be only
stochastic.


(C) Occupation states: continuous set, t discrete; process can be only
stochastic.


(D) Occupation states: continuous, t continuous; process can be
either stochastic or deterministic or ‘blended’.


RECURRENCE AND FIRST PASSAGE TIMES§


A more limited purpose than studying the typical trajectory is to con- 
sider first passage times, and recurrence times for any particular state. We 
shall treat the problem from a phenomenological point of view and explain 
in physical terms the difficulties involved by considering the classes of stochas- 
tic processes listed above as A, B, C and D one by one. 


Class A. We shall denote the states by S_i and the discrete t-axis by t_k,
where the t_k's are ordered. Given that at t_l the system is in S_i, we ask what
is the probability F(j|i; t_k, t_l) (k > l) that the system enters S_j for the first
time at t_k. In this case if a system is in a certain state at some t_k and continues
to be in the same state at t_{k+1}, we shall not treat it as continuance but
treat it as a “fresh” entry into that state. F satisfies the equations

$$
\pi(j|i; t_k, t_l) = \sum_{n=l+1}^{k} F(j|i; t_n, t_l)\, \pi(j|j; t_k, t_n) \tag{4}
$$

$$
\pi(j|i; t_{l+1}, t_l) = F(j|i; t_{l+1}, t_l) \tag{5}
$$

with π(j|j; t_k, t_k) = 1. By successive substitution, we obtain the probability
of first passage at all values of t. Note that F(j|i; t_{l+1}, t_l) corresponds to
the R functions in the case of continuous t.


The probability of recurrence of a state S_l at t_k given that it occupied
S_l at t_l is obtained merely by taking j = l in the expression for first passage,
F(j|l; t_k, t_l). Thus the problem of determining the distribution of first
passage and recurrence times is straightforward in this case.
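

The successive substitution described for Class A can be sketched for a time-homogeneous chain, where π(j|i; t_k, t_l) depends only on the number of steps n = k − l. The code below is a minimal illustration with a hypothetical two-state transition matrix; the names `n_step_probabilities` and `first_passage` and the numerical values are not from the text. It solves the first-passage relation π_n(j|i) = Σ_{m=1}^{n} F_m(j|i) π_{n−m}(j|j) for F term by term, using π_0(j|j) = 1.

```python
def n_step_probabilities(p, n_max):
    """Return pis[n][k][l] = probability of being in state k after n steps from l.

    p[k][l] is the one-step probability of going from state l to state k,
    so each column of p sums to one.
    """
    size = len(p)
    identity = [[1.0 if r == c else 0.0 for c in range(size)] for r in range(size)]
    pis = [identity]
    for _ in range(n_max):
        prev = pis[-1]
        nxt = [[sum(p[r][m] * prev[m][c] for m in range(size)) for c in range(size)]
               for r in range(size)]
        pis.append(nxt)
    return pis


def first_passage(p, j, i, n_max):
    """Solve pi_n(j|i) = sum_{m=1}^{n} F_m(j|i) pi_{n-m}(j|j) for F_1, ..., F_{n_max}."""
    pis = n_step_probabilities(p, n_max)
    f = [0.0] * (n_max + 1)          # f[0] is unused
    for n in range(1, n_max + 1):
        known = sum(f[m] * pis[n - m][j][j] for m in range(1, n))
        f[n] = pis[n][j][i] - known  # the m = n term carries pi_0(j|j) = 1
    return f[1:]


# Hypothetical two-state chain: from state 0 jump to 1 with probability a,
# from state 1 jump to 0 with probability b.
a, b = 0.3, 0.6
P = [[1 - a, b],
     [a, 1 - b]]

F = first_passage(P, j=1, i=0, n_max=6)
print([round(x, 6) for x in F])
```

Taking j = i in `first_passage` gives the recurrence distribution; with the convention of the text, remaining in the state at the next step counts as a fresh entry, so the one-step recurrence probability equals the one-step probability of staying.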


Class B. In this case a system is in S_i at t and at t + dt it continues with
probability 1 − O(dt), and continuance at t + dt should not be treated as
an arrival at this state in the interval dt. We have therefore to define
F(j|i; t) dt as the probability that S_j is entered for the first time in the interval
dt given that the system was at S_i at t = 0. It is clear that i should not
be equal to j in the above definition. π will satisfy the equation

$$
\pi(j|i; t) = \int_0^{t} F(j|i; \tau)\, \pi(j|j; t - \tau)\, d\tau \tag{6}
$$

F is determined if we know π, and π satisfies the well-known stochastic equation
(2).


§ Not much work from a physical point of view has been done in this aspect of stochastic 
theory and Bartlett’s Chapter on this problem deals with only the beginnings of this subject and 
does not advert to some of the difficulties in the case of complex stochastic processes. But he makes 
specific reference to this in his recent paper (Bartlett, 1953).


To obtain the recurrence time, we require the probability p(j|j; t) dt
that S_j recurs in the interval dt. It means that the system has jumped off
from the state S_j somewhere between 0 and t and is entering S_j again, for
the first time since its exit from that state, between t and t + dt.


$$
p(j|j; t)\, dt = dt \int_0^{t} C(j; \tau) \sum_{k} R(k|j)\, F(j|k; t - \tau)\, d\tau \tag{7}
$$

where R(k|j) has been defined earlier and C(j; τ) is the probability of continuing
in the state S_j in the interval 0 to τ. This is given by

$$
C(j; \tau) = \exp\left[-\sum_{k} R(k|j)\, \tau\right] \tag{8}
$$

p is therefore completely determined, if we know R, π and F.


Class C. This is exactly similar to Class A, except that we have to define
F(E|E'; t_k, t_l) dE, (k > l), the probability that the system enters the occupation
state between E and E + dE at t_k given that it was at E' at t_l. The other
arguments run on similar lines and so integrals suitably replace the summations
in Class A.


Class D. This is the most interesting, especially in view of the difficulties
that occur in the case of “blended” processes. We therefore shall consider
the problem in increasing order of complexity.


(i) Basic random process. Consider the system to be at E_0 at t = 0;
we wish to know the probability F(E|E_0; t) dE dt, (E ≠ E_0), that a state
between E and E + dE is entered for the first time between t and t + dt. Here
we get the most remarkable result that the probability that the system enters
the state between E and E + dE between 0 and t, then jumps off and re-enters
a state between E and E + dE between t and t + dt, is of the order of (dE)² dt
while the probability of first passage between t and t + dt is of the order of
dE dt. Hence we can write

$$
F(E|E_0; t) = \int_{E'} \pi(E'|E_0; t)\, R(E|E')\, dE' \tag{9}
$$


Compare this with equations (4) and (6) where F occurred as a ‘kernel’,
i.e., as a term within the summation or integral signs. The integral equation
for π corresponding to (6) can be written in the form,

$$
\pi(E|E_0; t) = \int_0^{t} F(E|E_0; \tau)\, C(E; t - \tau)\, d\tau \tag{10}
$$

where C(E; τ) is the probability that a system which is found in E at t = 0
continues to be in the same state till t = τ. Differentiating the above expression
and substituting the value for F according to equation (9) we obtain

$$
\frac{\partial \pi(E|E_0; t)}{\partial t} = -\pi(E|E_0; t) \int_{E'} R(E'|E)\, dE' + \int_{E'} \pi(E'|E_0; t)\, R(E|E')\, dE' \tag{11}
$$

which is the standard equation for a basic random process.





The problem of recurrence times does not occur in this case since we
have shown that the probability of recurrence is of a higher order of smallness,
$O((dE)^2\,dt)$, as compared to the probability of first passage, $O(dE\,dt)$, as is
also evident from the derivation of the integral equation for $\pi$. Hence the
probability of recurrence can be taken to be zero. This striking result, we must
note, is due to the infinitesimal nature of $dE$ and not that of $dt$. Hence if we take
a group of states between $E$ and $E + \Delta E$, $\Delta E$ being finite, the problem of
recurrence and first passage times into this group is not so simple as the one
considered above. The computation leads to almost intractable difficulties,
as a division of the $E$ space into finite intervals will render the process
non-Markovian.
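
The loss of the Markovian property on grouping states can be seen in a small discrete analogue. The following is only a sketch under stated assumptions: the three-state, discrete-time transition matrix `P` and the grouping into `GROUP_A` and `GROUP_B` are hypothetical and are not taken from the text. It computes exactly the probability of moving into group B, given that the system is now in group A, for two different previous histories.

```python
from fractions import Fraction as Fr

# Hypothetical 3-state discrete-time Markov chain; P[i][j] is the
# probability of moving from state i to state j (each row sums to 1).
P = [
    [Fr(1, 2), Fr(1, 2), Fr(0)],
    [Fr(0), Fr(1, 2), Fr(1, 2)],
    [Fr(1, 2), Fr(0), Fr(1, 2)],
]
GROUP_A = (0, 1)  # plays the role of a finite interval of states
GROUP_B = (2,)


def prob_next_in_b(prev_group, start):
    """P(X2 in B | X1 in A, X0 in prev_group), with X0 distributed as `start`."""
    num = den = Fr(0)
    for i in prev_group:
        for j in GROUP_A:
            path = start[i] * P[i][j]  # weight of the path X0 = i, X1 = j
            den += path
            num += path * sum(P[j][k] for k in GROUP_B)
    return num / den


uniform = [Fr(1, 3)] * 3
after_a = prob_next_in_b(GROUP_A, uniform)  # history: A then A
after_b = prob_next_in_b(GROUP_B, uniform)  # history: B then A
print(after_a, after_b)
```

The two conditional probabilities differ, so the grouped process remembers how group A was entered, although the underlying chain is Markovian.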





Simple blended processes.—Here we postulate an infinitesimal
deterministic change $f(E)\,dt$ for a change $dt$ in $t$ if the system is in a state
$E$ at $t$. Of course we also assume the possibility of random transitions from $E$
to an interval between $E'$ and $E' + dE'$ ($E - E'$ being finite) in the interval $dt$,
with probability $R(E'|E)\,dt\,dE'$. A typical trajectory is characterised by
a curve which is continuous, representing deterministic changes, except at
a finite number of points where random transitions have been realised.
Phenomenologically there is no meaning in speaking of first entry into a state
between $E$ and $E + dE$ between $t$ and $t + dt$, because this state is entered in
two ways: either by a jump from some state $E'$ to the interval between $E$ and
$E + dE$ in a random transition, or by crossing the state $E$ in the interval between
$t$ and $t + dt$. A slight reflection will show that the two concepts of crossing
and jumping cannot be defined by the same function, and hence a proper
definition of first passage and recurrence breaks down unless in considering
jumping we do not consider crossing and vice versa.
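
The two modes of entry can be told apart on a simulated trajectory. The sketch below is illustrative only: the drift `-E`, the uniform jump law, the jump rate and the Euler time step are all hypothetical choices, and `blended_path` and `entries` are names introduced here. It counts separately the entries into a narrow band $[E, E + dE)$ by crossing and by jumping.

```python
import random


def blended_path(e0, t_end, dt, drift, jump_rate, jump_sampler, rng):
    """Euler sketch of a blended process: in each step of length dt the state
    either makes a random transition (probability jump_rate * dt) or changes
    deterministically by drift(E) * dt."""
    path, jump_steps = [e0], set()
    e = e0
    for step in range(int(round(t_end / dt))):
        if rng.random() < jump_rate * dt:
            e = jump_sampler(rng)      # finite random transition
            jump_steps.add(step + 1)   # index in `path` of the landing point
        else:
            e = e + drift(e) * dt      # infinitesimal deterministic change
        path.append(e)
    return path, jump_steps


def entries(path, jump_steps, level, width):
    """Count entries into [level, level + width) by crossing and by jumping."""
    crossing = jumping = 0
    for i in range(1, len(path)):
        inside_now = level <= path[i] < level + width
        inside_before = level <= path[i - 1] < level + width
        if inside_now and not inside_before:
            if i in jump_steps:
                jumping += 1
            else:
                crossing += 1
    return crossing, jumping


rng = random.Random(0)
path, jump_steps = blended_path(
    e0=1.5, t_end=200.0, dt=0.001,
    drift=lambda e: -e,                          # hypothetical f(E)
    jump_rate=1.0,                               # hypothetical total jump rate
    jump_sampler=lambda r: r.uniform(0.0, 2.0),  # hypothetical jump law
    rng=rng,
)
crossing, jumping = entries(path, jump_steps, level=1.0, width=0.01)
print(crossing, jumping)
```

Under these assumptions the number of crossings hardly depends on the width of the band, while the number of entries by jumping shrinks with it, which is one way of seeing why a single function cannot describe both.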


The same problem will arise if the finite random transitions are blended
with infinitesimal random changes. If the finite random transitions are absent
and only infinitesimal random changes are postulated, we can define a
function $F(E|E_0; t)\,dt$, the probability that the state $E$ is crossed between $t$ and
$t + dt$. (Note the absence of $dE$.) This problem is dealt with presently
in the following treatment of integrals of random functions.

Integrals of random functions.—The possibility of infinitesimal random
changes for infinitesimal changes of $t$ led the author to a phenomenological
definition of integrals of random functions. The concept of stochastic
integration is shrouded so much in abstract mathematical language that it is
not possible for one who is interested in its physical interpretation to glean
from such rigorous treatment the essential physical basis which is necessary
in applying any theory to concrete problems. The author therefore felt it
necessary to attempt this problem from the beginning, using only the concept
of the realised curve or the trajectory of a stochastic process. As the
phenomenological theory has been elaborated in a series of three papers by the
author (Ramakrishnan, 1955), we confine ourselves only to the fundamental
definition of an integral of a random function and explain in general terms
how it can be applied to physical problems. We now consider the postulate
of infinitesimal random changes in infinitesimal periods of time. If $X^R(t)$
be the realised value of the random function at $t$, and we increase $t$ to $t + dt$,
our postulate requires that $X^R(t + dt) - X^R(t)$ is $O(dt)$ or, more precisely,
$K\,dt$, $K$ being the realised value of a random function of $t$, say $\phi(t)$. This
random function of $t$ can be symbolically represented as the differential
coefficient of $X(t)$, i.e.,


$$\phi(t) = \frac{d}{dt} X(t). \tag{12}$$


At first sight it would seem that it is the knowledge of the random function
$X(t)$ that determines its differential coefficient. A closer examination will
show that it is more logical to treat $X(t)$ as an integral of the random function
$\phi(t)$, so that a knowledge of $\phi(t)$ is necessary to study $X(t)$. The point can easily
be illustrated by asking for the distribution of the function $\phi(t)$, given the nature
of $X(t)$, or vice versa.


We shall assume that the joint distribution of $X(t)$ at two points $t_1$ and $t_2$
is known and is specified by the function $\pi(X_1, X_2; t_1, t_2)$. We shall not
assume that $X(t)$ is Markovian. Therefore the joint distribution cannot
be expressed simply in terms of $\pi(X; t)$, the distribution function at $t$. The
question of arriving at this distribution function will be discussed later. But
for the moment we assume that we are provided with such a distribution
function. Defining $X(t_2) - X(t_1)$ as a new random function $\Phi(t_1, t_2)$, the
joint frequency function of $X(t_1)$ and $X(t_2)$ immediately determines the joint
frequency function of $X(t_1)$ and $\Phi(t_1, t_2)$, and equally that of $X(t_1)$ and
$\Phi(t_1, t_2)/(t_2 - t_1)$. If we now let $t_2 \to t_1$, and postulate that the values which
$\Phi(t_1, t_2)$ can assume as $t_2 \to t_1$ are $O(t_2 - t_1)$, then we can obtain the joint
distribution of $X(t_1)$ and $\Phi(t_1, t_2)/(t_2 - t_1)$ as $t_2 \to t_1$. It is reasonable to denote
$\lim_{t_2 \to t_1} \Phi(t_1, t_2)/(t_2 - t_1)$ symbolically by the random variable
$\dot{X}(t_1) = \phi(t_1)$. Since we are now in possession of the joint distribution of
$X(t)$ and $\dot{X}(t)$, by integrating this function over the values of $X$ we obtain
the frequency function of $\dot{X}(t)$.


Thus from the above discussion we obtain the lemma that the joint
distribution of $X(t)$ at two points helps us to obtain the distribution of $\dot{X}(t)$.
But the important question remains as to how the joint distribution of $X(t_1)$
and $X(t_2)$, or at least the distribution function of $X(t)$, can be obtained or
derived. In the opinion of the author it is more logical and phenomenologically
more satisfying to conceive of $X(t)$ as an integral of $\phi(t)$, so that a knowledge
of $\phi(t)$ helps us to obtain the knowledge of $X(t)$. Accepting for the
moment this point of view, if $\phi(t)$ is itself characterised by infinitesimal random
changes in infinitesimal periods of time, we should view that process as an
integral of another process, and so on till we reach a random process which is
not characterised by infinitesimal random changes. It is
reasonable to suppose that it is a basic random process. Denoting it for the
moment by $X(t)$, we can build a whole class of processes characterised by
infinitesimal random changes in infinitesimal periods of time, by defining
symbolically that class as


$$Y_n(t) = \phi_n(t) \int^{t} \phi_{n-1}(\tau_{n-1})\, d\tau_{n-1} \int^{\tau_{n-1}} \cdots \int^{\tau_1} \phi_0(\tau_0)\, X(\tau_0)\, d\tau_0 \tag{13}$$


where $\phi_0(t)$, $\phi_1(t)$, $\phi_2(t)$, etc., are fully deterministic functions of $t$. The
author is not suggesting that any random process characterised by
infinitesimal random changes in infinitesimal periods of time can be represented by
an integral like (13), but only remarks that an entire class can be defined by
the above integral and therefore by deterministic functions of such integrals
and simple combinations of such functions. It is clear that if we assume the
basic random process to be Markovian, the integral process $Y_n(t)$ is
non-Markovian, since the development of $Y_n(t)$ requires a knowledge of $Y_{n-1}(t)$
and so on; only if we consider the process represented by the entire
aggregate $Y_n(t), Y_{n-1}(t), \ldots, Y_1(t)$ and $X(t)$ do we obtain a 'quasi' Markovian
process (the significance of the word 'quasi' will be explained later) whose
development can be studied by standard methods. From the above
considerations it is clear that even if we require the distribution function of
$Y_n(t)$ alone, we must first obtain the joint distribution of the aggregate
$Y_n(t), Y_{n-1}(t), \ldots$ and then integrate over the intermediate variables.
This procedure has been adopted successfully by Srinivasan and Mathews
(1956) in the case of a class of basic random processes.
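
The role of the aggregate can be illustrated by a small simulation. The following is a minimal sketch and not the method of the papers cited: it assumes, purely for illustration, a basic random process $X(t)$ that takes the values $+1$ and $-1$ and changes sign at random instants with a constant rate, and it takes all the deterministic weight functions equal to unity. The function name `simulate_aggregate` and the numerical values are hypothetical.

```python
import math
import random


def simulate_aggregate(t_end, rate, rng):
    """Return (X, Y1, Y2) at time t_end, where X is a two-valued basic random
    process flipping sign at rate `rate`, Y1 is the integral of X and Y2 the
    integral of Y1, starting from X = +1 and Y1 = Y2 = 0."""
    t, x, y1, y2 = 0.0, 1, 0.0, 0.0
    while True:
        hold = rng.expovariate(rate)   # time until the next sign change
        remaining = t_end - t
        step = min(hold, remaining)
        # Between flips X is constant, so Y1 is linear and Y2 quadratic in time.
        # Only the current values (x, y1, y2) are needed to advance the aggregate.
        y2 += y1 * step + 0.5 * x * step * step
        y1 += x * step
        if hold >= remaining:
            return x, y1, y2
        t += hold
        x = -x


rng = random.Random(1)
samples = [simulate_aggregate(1.0, 1.0, rng) for _ in range(20000)]
mean_y1 = sum(s[1] for s in samples) / len(samples)
# For this two-valued process E[X(s)] = exp(-2 * rate * s), so that
# E[Y1(t)] = (1 - exp(-2 * rate * t)) / (2 * rate).
expected_y1 = (1.0 - math.exp(-2.0)) / 2.0
print(mean_y1, expected_y1)
```

The update inside the loop uses `x`, `y1` and `y2` together; the value of `y2` alone would not suffice to continue the trajectory, which is the sense in which only the aggregate can be developed by standard methods.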


There is however an alternative treatment available if we are interested 
only in the distribution of Y,,(t) in the special case when the basic random 
process is of a very simple type with certain symmetric properties. We shall 
not however go into that method in any detail here, but merely indicate one 
of the important results that was obtained in the course of the application of 
that method. It led to the concept of ‘equivalent’ processes. A process 
is said to be equivalent to another if the distribution functions of the two
stochastic variables $Y(t)$ and $Y^*(t)$ under consideration are identical at all
instants of time. Note that we are not stating that the joint distribution of
$Y(t)$ at more than one point, say $t_1$ and $t_2$, is identical with the joint
distribution of $Y^*(t)$ at the same points $t_1$ and $t_2$. We only emphasise that the
distribution function of $Y(t)$ at $t_1$ is identical with the distribution function of
$Y^*(t)$ at $t_1$, as $t_1$ takes any value. This result is probably implied in many
abstract treatments of the theory of stochastic processes. But the author has
not seen any particular illustration of this rather remarkable feature of
stochastic processes. This has no parallel in the case of static distributions. If
we have two stochastic variates $x$ and $y$, and they have identical distributions,
we say that they display the same statistical features. But in stochastic
processes we do not have just a stochastic variable, but a random function $X(t)$
instead of $x$ and $Y(t)$ instead of $y$, and these represent a continuous infinite
sequence of stochastic variables; the distribution function deals with one
of them "at a time", and hence information about the distribution
function alone cannot give us a complete picture of the stochastic process. The
author has succeeded in citing a number of examples of such equivalent
processes, and we refer the reader to the series of papers on integrals of random
functions (1955).
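
A simple pair of processes with this property may help to fix ideas. The example below is a standard textbook one chosen for illustration and is not one of the examples from the papers cited: $W(t)$ is assumed to be a Wiener process and $Z$ a single normal variate with zero mean and unit variance.

```latex
\begin{align*}
Y(t) &= W(t), & Y^{*}(t) &= \sqrt{t}\,Z, \\
Y(t_1) &\sim N(0,\,t_1), & Y^{*}(t_1) &\sim N(0,\,t_1)
  \quad\text{for every } t_1 > 0, \\
\operatorname{Cov}\bigl[Y(t_1),\,Y(t_2)\bigr] &= \min(t_1,\,t_2), &
\operatorname{Cov}\bigl[Y^{*}(t_1),\,Y^{*}(t_2)\bigr] &= \sqrt{t_1 t_2}.
\end{align*}
```

The distribution functions at any single instant coincide, while the covariances, and hence the joint distributions at two instants $t_1 \neq t_2$, differ.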


REMARKS ON MARKOVIAN, NON-MARKOVIAN 
AND QUASI-MARKOVIAN PROCESSES 


We have referred to the integrals of random functions as non-Markovian 
processes and we have emphasised earlier that non-Markovian processes 
are a residuary class and cannot be exhausted by classification. Integration 
is one way of generating non-Markovian from Markovian processes. But 
it belongs to that class of non-Markovian processes which by the introduction 
of additional parameters can be rendered quasi-Markovian. The difficulty of 
defining a general non-Markovian process can easily be illustrated in the
following manner. Suppose $X(t)$ is non-Markovian and $t$ is continuous. This
means that if we know the value of $X$ at some value of $t$, it is not possible to
predict what happens between $t$ and $t + dt$ merely from the knowledge of the
value of $X(t)$. What other information do we require?
Let us postulate that we require information about the realised value of $X(\tau)$,
$\tau < t$. If we require this information about all the states in a domain $\tau < t$, it
leads to the mathematically difficult and possibly intractable problem of
defining a function of a continuous infinite number of points or variables
represented by the value of $\tau$. Or to put it more precisely, we can speak of
the function of a variable $\tau$ or of an enumerable number of variables $\tau_1, \tau_2, \ldots$,
etc., but not of a continuous infinity of variables in an interval $t_0 < \tau < t$.
An interesting class of non-Markovian processes can be cited easily which
have a simpler dependence on previous history. Suppose we have a
distribution of random points on a line represented by the $t$ axis and we postulate
the following law of distribution. The probability that a point lies between
$\tau_1$ and $\tau_1 + d\tau_1$ is equal to $\phi(\tau_1 - \tau)\,d\tau_1$, given that the previous point has
occurred at $\tau$, and does not depend upon the distribution of points prior to $\tau$.
To put it picturesquely, the memory of the process can be traced only up to
the last realised point. Let a point be realised at $t = 0$ and let us require
$\pi(n; t)$, the probability that there are $n$ points in the interval 0 to $t$. $\pi(n; t)$
is non-Markovian since it is impossible to predict the probability that
a point will lie between $t$ and $t + dt$ if we are just given that $n$ points lie
between 0 and $t$. We have to know where the $n$th point has occurred.
However it possesses a very interesting property which we have not adverted to
till now. It is "regenerative" with respect to the occurrence of a point. In
other words, when a point occurs the process "loses all memory" and
starts "afresh". Regenerative processes first attracted the attention of
Bellman and Harris and have been applied with success to many important
processes later on. Bartlett in his book has given an interesting definition of a
regenerative process in fairly general terms. If a system can occupy a
discrete set of states $S_1, S_2, \ldots$, etc., and we are interested in the stochastic
process which is described by defining the occupation state in which the system
lies at parametric value $t$ as $t$ varies, we call it non-Markovian if, given that
it is in a state $S_i$ at $t$, we are not able to predict the development of the process
between $t$ and $t + dt$. However, if among the aggregate of occupation states
there is a sub-aggregate $S_1', S_2', S_3', \ldots$, etc., such that if the system is found in
any one of those states at $t$ it is possible to predict what happens between
$t$ and $t + dt$, we will call the process regenerative with respect to these states.
This definition is fairly general and has to be adapted to particular
circumstances. In fact some fundamental difficulties occur in translating this
general description into exact terms in physical problems. The regeneration
process referred to above, which is so delightfully simple, becomes quite
complicated from the viewpoint described above. If we define the occupation
states as the number of points 0, 1, 2, etc., the system is not regenerative with
respect to any one of the states, but we know that it is regenerative with respect
to the occurrence of a point. So let us divide the interval 0 to $t$ into a
large number $N$ of small and equal intervals of length $\Delta$, with $N\Delta = t$. In
every interval there can be 0, 1, 2, 3, etc., random points. As $\Delta$ becomes very
small, let us postulate that the probability that there is one random point
is $O(\Delta)$ and that of no random point is $1 - O(\Delta)$, while the probability of $k$
random points ($k \geq 2$) is $O(\Delta^k)$, which is small compared to $O(\Delta)$. We therefore
define two occupation states, characterised by the numbers 0 and 1, the number
of random points in the interval. The state at $t$ is to be interpreted by taking
a small interval of width $\Delta$ in the neighbourhood of $t$ and asking the question
whether there is a random point in it or not, and the system is now
regenerative with respect to the occupation state 1, representing the number in $\Delta$.


It is quite obvious that a Markovian process is always a regenerative
process, the converse of course not being true. It is possible to study
regenerative processes by an elegant mathematical technique though they are
not Markovian. The technique is suggested by the regenerative property
itself. Suppose we are interested in the probability distribution function
of the system at $t$; we first assume that the system is in a state $S_j$ with
respect to which the process is regenerative. We now study how the system for
the first time enters another regenerative state $S_k$ at some point between $\tau$
and $\tau + d\tau$. At that point $\tau$ the process 'loses memory' and starts afresh
with the new initial condition $S_k$, and the duration of this freshly started
process is $t - \tau$. Since $S_k$ can represent any of the regenerative states and $\tau$
can be any point between 0 and $t$, this approach leads to the integral equation





$$\pi(l|j^+; t) = \int_0^t \sum_{k^+} F(k^+|j^+; \tau)\, \pi(l|k^+; t - \tau)\, d\tau. \tag{14}$$

In some cases it may be possible to express $\pi(l|k^+; t)$ in terms of $\pi(l|j^+; t)$,
and this makes the situation simpler. We must remember that $l, j, k, \ldots$,
etc., refer to the states $S_l, S_j, S_k, \ldots$, which can be interpreted in the widest
possible manner to suit the needs of the problem. Accordingly, the
summation sign over $k^+$ must be interpreted as an integral or a sum. We also assume
that the first transition from $j^+$ can be only to a regenerative state.
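
The regenerative argument can be tried numerically on the random points on a line discussed earlier. The following is a minimal sketch in the spirit of (14) and not a transcription of it: it assumes that regeneration takes place at the first point after $t = 0$, so that the probability of $n$ points in $(0, t)$ is the integral over $\tau$ of the spacing density at $\tau$ times the probability of $n - 1$ points in the remaining time $t - \tau$. The function name `count_distribution`, the grid and the exponential spacing law with the value of `rate` are hypothetical; for exponential spacings the counts are known to follow the Poisson law, which serves as a check.

```python
import math


def count_distribution(density, t_end, n_max, steps):
    """Approximate the probability of n points in (0, t_end), n = 0..n_max,
    for points whose successive spacings are independent with the given
    probability density, a point being realised at t = 0."""
    h = t_end / steps
    # w[m]: approximate probability that the next point falls in ((m-1)h, mh].
    w = [0.0] + [density((m - 0.5) * h) * h for m in range(1, steps + 1)]
    # n = 0: no point has occurred up to time i*h.
    prev = [1.0]
    for m in range(1, steps + 1):
        prev.append(prev[-1] - w[m])
    result = [prev[steps]]
    for _ in range(n_max):
        # Regeneration at the first point (placed at time m*h): the process
        # starts afresh there and must yield one point fewer in the time left.
        cur = [sum(w[m] * prev[i - m] for m in range(1, i + 1))
               for i in range(steps + 1)]
        result.append(cur[steps])
        prev = cur
    return result


rate = 1.5  # hypothetical value
approx = count_distribution(lambda s: rate * math.exp(-rate * s), 2.0, 4, 400)
poisson = [math.exp(-rate * 2.0) * (rate * 2.0) ** n / math.factorial(n)
           for n in range(5)]
for n, (a, p) in enumerate(zip(approx, poisson)):
    print(n, round(a, 4), round(p, 4))
```

Any other spacing density can be passed in place of the exponential one; the Poisson comparison then no longer applies, but the recursion itself is unchanged.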

We have now described the general 'physical' features of a stochastic
process evolving with respect to a one-dimensional parameter $t$, the Markovian
processes being a simple class of such processes. If $t$ were two- or
many-dimensional, it would be difficult to define Markovian processes, since $t$ cannot be
ordered and the meaning of "previous history" becomes obscure. In a
similar manner a "regenerative" property cannot be easily attributed to
such processes. A discussion of these difficulties and examples to illustrate
the observations in this paper will follow in a later contribution elsewhere
(Handbuch der Physik, Springer Verlag, to be published in 1957).


The author was encouraged to publish this work after his discussions
with the probability seminar group at the Massachusetts Institute of
Technology, which invited him to deliver a course of six lectures on the physical
approach to stochastic processes. In particular, his discussions with
Dr. Bayard Rankin have helped him to cast this paper in a form which, it is
hoped, may prove of some use to the physicist without ignoring the demands
of rigour.